
9.4: More on 4-vectors and 4-tensors


    This is a good moment to introduce a formalism that will allow us, in particular, to solve the same proton collision problem in one more (and arguably, the most elegant) way. Much more importantly, this formalism will be virtually necessary for the description of the Lorentz transform of the electromagnetic field, and its interaction with relativistic particles – otherwise the formulas would be too cumbersome.

    Let us call the 4-vectors we have used before,

    \[\ \text{Contravariant 4-vectors}\quad\quad\quad\quad A^{\alpha} \equiv\left\{A_{0}, \mathbf{A}\right\},\tag{9.84}\]

    contravariant, and denote them with top indices, and introduce also covariant vectors,

    \[\ \text{Covariant 4-vectors}\quad\quad\quad\quad A_{\alpha} \equiv\left\{A_{0},-\mathbf{A}\right\},\tag{9.85}\]

    marked by lower indices. Now if we form a scalar product of these two vectors using the standard (3D-like) rule, just as a sum of the products of the corresponding components, we immediately get

    \[\ A_{\alpha} A^{\alpha} \equiv A^{\alpha} A_{\alpha} \equiv A_{0}^{2}-A^{2}.\tag{9.86}\]

    Here and below the sign of the sum of four components of the product has been dropped.37 The scalar product (86) is just the norm of the 4-vector in our former definition and, as we already know, is Lorentz-invariant. Moreover, the scalar product of two different vectors (also a Lorentz invariant) may be rewritten in either of two similar forms:38

    \[\ \text{Scalar product's forms}\quad\quad\quad\quad A_{0} B_{0}-\mathbf{A} \cdot \mathbf{B} \equiv A_{\alpha} B^{\alpha}=A^{\alpha} B_{\alpha};\tag{9.87}\]

    again, the only caveat is to take one vector in the covariant, and the other one in the contravariant form.
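    The rule of Eqs. (85)-(87) may be illustrated numerically. The following is only a minimal sketch: the helper names `lower` and `scalar_product` and the sample components are assumptions made for illustration, not part of the text.

```python
# Illustrative sketch of Eqs. (84)-(87); helper names are hypothetical.

def lower(a_contra):
    """Covariant components {A0, -A} built from contravariant {A0, A}."""
    a0, ax, ay, az = a_contra
    return (a0, -ax, -ay, -az)

def scalar_product(a_cov, b_contra):
    """Plain sum of products of corresponding components, as in Eq. (87)."""
    return sum(x * y for x, y in zip(a_cov, b_contra))

# Arbitrary contravariant components {A0, Ax, Ay, Az}, chosen for illustration
A = (5.0, 1.0, 2.0, 3.0)
B = (4.0, -1.0, 0.5, 2.0)

norm_A = scalar_product(lower(A), A)  # A0^2 - A^2, Eq. (86)
ab_1 = scalar_product(lower(A), B)    # A_alpha B^alpha
ab_2 = scalar_product(lower(B), A)    # A^alpha B_alpha

print(norm_A, ab_1, ab_2)
```

    Both orderings give the same number because the sign flip of the spatial components may be attributed to either vector, but it must be applied exactly once.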

    Now let us return to our sample problem (Fig. 10). Since all components (\(\ \mathscr{E}/c\) and \(\ \mathbf{p}\)) of the total 4-momentum of our system are conserved at the collision, its norm is conserved as well:

    \[\ \left(p_{a}+p_{b}\right)_{\alpha}\left(p_{a}+p_{b}\right)^{\alpha}=(4 p)_{\alpha}(4 p)^{\alpha}.\tag{9.88}\]

    Since the scalar product is now the usual math construct, we know that the parentheses on the left-hand side of this equation may be multiplied as usual. We may also swap the operands and move constant factors through products as convenient. As a result, we get

    \[\ \left(p_{a}\right)_{\alpha}\left(p_{a}\right)^{\alpha}+\left(p_{b}\right)_{\alpha}\left(p_{b}\right)^{\alpha}+2\left(p_{a}\right)_{\alpha}\left(p_{b}\right)^{\alpha}=16 p_{\alpha} p^{\alpha}.\tag{9.89}\]

    Thanks to the Lorentz invariance of each of the terms, we may calculate it in any reference frame we like. For the first two terms on the left-hand side, as well as for the right-hand-side term, it is beneficial to use the frames in which that particular proton is at rest; as a result, according to Eq. (77b), each of the two left-hand-side terms equals \(\ (m c)^{2}\), while the right-hand side equals \(\ 16(m c)^{2}\). On the contrary, the last term on the left-hand side is more easily evaluated in the lab frame, because in it, the three spatial components of the 4-momentum \(\ p_{b}\) vanish, and the scalar product is just the product of the scalars \(\ \mathscr{E} / c\) for protons \(\ a\) and \(\ b\). For the latter proton, being at rest, this ratio is just \(\ mc\) so that we get a simple equation,

    \[\ (m c)^{2}+(m c)^{2}+2 \frac{\mathscr{E}_{\min }}{c} m c=16(m c)^{2},\tag{9.90}\]

    immediately giving the final result \(\ \mathscr{E}_{\min }=7 m c^{2}\), already obtained earlier in two more complex ways.
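    The arithmetic of Eq. (90) may be cross-checked numerically. This is only a sketch under the assumption of units with \(\ m=c=1\); the variable names are hypothetical. It solves Eq. (90) for the threshold energy and then verifies that the norm of the total lab-frame 4-momentum equals \(\ 16(m c)^{2}\).

```python
import math

m, c = 1.0, 1.0  # assumption: units in which the proton mass and c equal 1

# Eq. (90): 2 (mc)^2 + 2 (E_min / c) m c = 16 (mc)^2, solved for E_min
E_min = (16 - 2) * (m * c) ** 2 * c / (2 * m * c)

# Cross-check in the lab frame, with 4-momenta written as {E/c, p}
p_a_mag = math.sqrt((E_min / c) ** 2 - (m * c) ** 2)  # |p| of the moving proton
p_a = (E_min / c, p_a_mag, 0.0, 0.0)
p_b = (m * c, 0.0, 0.0, 0.0)                          # proton at rest

total = tuple(x + y for x, y in zip(p_a, p_b))
norm_total = total[0] ** 2 - sum(x * x for x in total[1:])

print(E_min, norm_total)
```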

    Let me hope that this example was a convincing demonstration of the convenience of representing 4-vectors in the contravariant (84) and covariant (85) forms,39 with Lorentz invariant norms (86). To be useful for more complex tasks, this formalism should be developed a little bit further. In particular, it is crucial to know how the 4-vectors change under the Lorentz transform. For contravariant vectors, we already know the answer (54); let us rewrite it in our new notation:

    \[\ \text{Lorentz transform: contravariant vectors}\quad\quad\quad\quad A^{\alpha}=L_{\beta}^{\alpha} A^{\prime \beta},\tag{9.91}\]

    where \(\ L_{\beta}^{\alpha}\) is the matrix (51), generally called the mixed Lorentz tensor:40

    \[\ \text{Mixed Lorentz tensor}\quad\quad\quad\quad L_{\beta}^{\alpha}=\left(\begin{array}{cccc}
    \gamma & \beta \gamma & 0 & 0 \\
    \beta \gamma & \gamma & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1
    \end{array}\right).\tag{9.92}\]

    Note that though the position of the indices \(\ \alpha\) and \(\ \beta\) in the Lorentz tensor notation is not crucial, because this tensor is symmetric, it is convenient to place them using the general index balance rule: the difference of the numbers of the upper and lower indices should be the same in both parts of any 4-vector/tensor equality. (Please check yourself that all the formulas above do satisfy this rule.)
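    To make Eqs. (91)-(92) concrete, here is a minimal numerical sketch. The function names and the sample values (\(\ \beta=0.6\) and the components of the primed vector) are assumptions chosen for illustration. It builds the matrix (92), applies it to a contravariant 4-vector, and compares the norms (86) in the two frames.

```python
import math

def lorentz_matrix(beta):
    """Matrix of Eq. (92) for relative velocity beta*c along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, beta * gamma, 0.0, 0.0],
        [beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(matrix, vec):
    """A^alpha = L^alpha_beta A'^beta, with the sum over beta written out."""
    return [sum(matrix[a][b] * vec[b] for b in range(4)) for a in range(4)]

def norm(vec):
    """A0^2 - A^2 for contravariant components {A0, A}, Eq. (86)."""
    return vec[0] ** 2 - sum(x * x for x in vec[1:])

A_prime = [5.0, 1.0, 2.0, 3.0]           # arbitrary components in the primed frame
A = apply(lorentz_matrix(0.6), A_prime)  # components in the unprimed frame

print(A, norm(A), norm(A_prime))
```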

    In order to rewrite Eq. (91) in a more general form that would not depend on the particular orientation of the coordinate axes (Fig. 1), let us use the contravariant and covariant forms of the 4-vector of the time-space interval (57),

    \[\ d x^{\alpha}=\{c d t, d \mathbf{r}\}, \quad d x_{\alpha}=\{c d t,-d \mathbf{r}\};\tag{9.93}\]

    then its norm (58) may be represented as41

    \[\ (d s)^{2} \equiv(c d t)^{2}-(d r)^{2}=d x^{\alpha} d x_{\alpha}=d x_{\alpha} d x^{\alpha}.\tag{9.94}\]

    Applying Eq. (91) to the first, contravariant form of the 4-vector (93), we get

    \[\ d x^{\alpha}=L_{\beta}^{\alpha} d x^{\prime \beta}.\tag{9.95}\]

    But with our new shorthand notation, we can also write the usual rule of differentiation of each component \(\ x^{\alpha}\), considering it as a function (in our case, linear) of four arguments \(\ x^{\prime \beta}\), as follows:42

    \[\ d x^{\alpha}=\frac{\partial x^{\alpha}}{\partial x^{\prime \beta}} d x^{\prime \beta}.\tag{9.96}\]

    Comparing Eqs. (95) and (96), we can rewrite the general Lorentz transform rule (91) in a new form,

    \[\ \text{Lorentz transform: general form}\quad\quad\quad\quad A^{\alpha}=\frac{\partial x^{\alpha}}{\partial x^{\prime \beta}} A^{\prime \beta},\tag{9.97a}\]

    which does not depend on the coordinate axes’ orientation.

    It is straightforward to verify that the reciprocal transform may be represented as

    \[\ \text{Reciprocal Lorentz transform}\quad\quad\quad\quad A^{\prime \alpha}=\frac{\partial x^{\prime \alpha}}{\partial x^{\beta}} A^{\beta}.\tag{9.97b}\]

    However, the reciprocal transform has to differ from the direct one only by the sign of the relative velocity of the frames, so that for the coordinate choice shown in Fig. 1, its matrix is

    \[\ \frac{\partial x^{\prime \alpha}}{\partial x^{\beta}}=\left(\begin{array}{cccc}
    \gamma & -\beta \gamma & 0 & 0 \\
    -\beta \gamma & \gamma & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1
    \end{array}\right).\tag{9.98}\]
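    That the matrices (92) and (98) are indeed mutually reciprocal may be checked by multiplying them. The following is a sketch with hypothetical helper names and an arbitrarily chosen \(\ \beta=0.6\): the product should be the identity matrix within rounding errors.

```python
import math

def boost_matrix(beta):
    """Matrix of Eq. (92); with beta replaced by -beta it becomes Eq. (98)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, beta * gamma, 0.0, 0.0],
        [beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(p, q):
    """Ordinary product of two 4x4 matrices."""
    return [[sum(p[a][k] * q[k][b] for k in range(4)) for b in range(4)]
            for a in range(4)]

direct = boost_matrix(0.6)       # dx^alpha / dx'^beta, Eq. (92)
reciprocal = boost_matrix(-0.6)  # dx'^alpha / dx^beta, Eq. (98)
product = matmul(reciprocal, direct)

for row in product:
    print(row)
```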

    Since according to Eqs. (84)-(85), covariant 4-vectors differ from the contravariant ones by the sign of their spatial components, their direct transform is given by the matrix (98). Hence their direct and reciprocal transforms may be represented, respectively, as

    \[\ \text{Lorentz transform: covariant vectors}\quad\quad\quad\quad A_{\alpha}=\frac{\partial x^{\prime \beta}}{\partial x^{\alpha}} A_{\beta}^{\prime}, \quad A_{\alpha}^{\prime}=\frac{\partial x^{\beta}}{\partial x^{\prime \alpha}} A_{\beta},\tag{9.99}\]

    evidently satisfying the index balance rule. (Note that primed quantities are now multiplied, rather than divided as in the contravariant case.) As a sanity check, let us apply this formalism to the scalar product \(\ A_{\alpha} A^{\alpha}\). As Eq. (96) shows, the implicit-sum notation allows us to multiply and divide any equality by the same partial differential of a coordinate, so that we can write:

    \[\ A_{\alpha} A^{\alpha}=\frac{\partial x^{\prime \beta}}{\partial x^{\alpha}} \frac{\partial x^{\alpha}}{\partial x^{\prime \gamma}} A_{\beta}^{\prime} A^{\prime \gamma}=\frac{\partial x^{\prime \beta}}{\partial x^{\prime \gamma}} A_{\beta}^{\prime} A^{\prime \gamma}=\delta_{\beta \gamma} A_{\beta}^{\prime} A^{\prime \gamma}=A_{\gamma}^{\prime} A^{\prime \gamma},\tag{9.100}\]

    i.e. the scalar product \(\ A_{\alpha} A^{\alpha}\) (as well as \(\ A^{\alpha} A_{\alpha}\)) is Lorentz-invariant, as it should be.
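    The first of Eqs. (99) may also be verified numerically. In this sketch (helper names and sample numbers are assumptions for illustration), the covariant components are transformed with the matrix (98) and compared with the result of first transforming the contravariant components with the matrix (92) and then flipping the sign of the spatial part.

```python
import math

def boost_matrix(beta):
    """Matrix of Eq. (92); with beta replaced by -beta it becomes Eq. (98)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, beta * gamma, 0.0, 0.0],
        [beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def lower(vec):
    """Covariant components {A0, -A} from contravariant {A0, A}."""
    return [vec[0]] + [-x for x in vec[1:]]

beta = 0.6
L = boost_matrix(beta)    # dx^alpha / dx'^beta, Eq. (92)
M = boost_matrix(-beta)   # dx'^beta / dx^alpha, Eq. (98)

A_prime = [5.0, 1.0, 2.0, 3.0]  # contravariant components, primed frame

# Route 1: transform contravariant components by Eq. (91), then lower the index
A = [sum(L[a][b] * A_prime[b] for b in range(4)) for a in range(4)]
A_cov_route1 = lower(A)

# Route 2: lower the index first, then use the first of Eqs. (99):
# A_alpha = (dx'^beta / dx^alpha) A'_beta, summed over beta
A_prime_cov = lower(A_prime)
A_cov_route2 = [sum(M[b][a] * A_prime_cov[b] for b in range(4)) for a in range(4)]

print(A_cov_route1, A_cov_route2)
```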

    Now, let us consider the 4-vectors of derivatives. Here we should be very careful. Consider, for example, the following 4-vector operator

    \[\ \frac{\partial}{\partial x^{\alpha}} \equiv\left\{\frac{\partial}{\partial(c t)}, \nabla\right\}.\tag{9.101}\]

    As was discussed above, the operator is not changed by its multiplication and division by another differential, e.g., \(\ \partial x^{\prime \beta}\) (with the corresponding implied summation over all four values of \(\ \beta\)), so that

    \[\ \frac{\partial}{\partial x^{\alpha}}=\frac{\partial x^{\prime \beta}}{\partial x^{\alpha}} \frac{\partial}{\partial x^{\prime \beta}}.\tag{9.102}\]

    But, according to the first of Eqs. (99), this is exactly how the covariant vectors are Lorentz-transformed! Hence, we have to consider the derivative over a contravariant space-time interval as a covariant 4-vector, and vice versa.43 (This result might also be expected from the index balance rule.) In particular, this means that the scalar product

    \[\ \frac{\partial}{\partial x^{\alpha}} A^{\alpha} \equiv \frac{\partial A_{0}}{\partial(c t)}+\nabla \cdot \mathbf{A}\tag{9.103}\]

    should be Lorentz-invariant for any legitimate 4-vector. A convenient shorthand for the covariant derivative, which complies with the index balance rule, is

    \[\ \frac{\partial}{\partial x^{\alpha}} \equiv \partial_{\alpha},\tag{9.104}\]

    so that the invariant scalar product may be written just as \(\ \partial_{\alpha} A^{\alpha}\). A similar definition of the contravariant derivative,

    \[\ \partial^{\alpha} \equiv \frac{\partial}{\partial x_{\alpha}}=\left\{\frac{\partial}{\partial(c t)},-\nabla\right\},\tag{9.105}\]

    allows us to write the Lorentz-invariant scalar product (103) in any of the following two forms:

    \[\ \frac{\partial A_{0}}{\partial(c t)}+\nabla \cdot \mathbf{A}=\partial^{\alpha} A_{\alpha}=\partial_{\alpha} A^{\alpha}.\tag{9.106}\]
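    As a quick check of the signs (a worked expansion added here for illustration, using only the definitions (85) and (105)), the two minus signs in the contravariant derivative and in the covariant vector cancel:

    \[\ \partial^{\alpha} A_{\alpha}=\frac{\partial A_{0}}{\partial(c t)}+(-\nabla) \cdot(-\mathbf{A})=\frac{\partial A_{0}}{\partial(c t)}+\nabla \cdot \mathbf{A}.\]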

    Finally, let us see how the general Lorentz transform changes 4-tensors. A second-rank \(\ 4 \times 4\) matrix is a legitimate 4-tensor if the 4-vectors it relates obey the Lorentz transform. For example, if two legitimate 4-vectors are related as

    \[\ A^{\alpha}=T^{\alpha \beta} B_{\beta},\tag{9.107}\]

    we should require that

    \[\ A^{\prime \alpha}=T^{\prime \alpha \beta} B_{\beta}^{\prime},\tag{9.108}\]

    where \(\ A^{\alpha}\) and \(\ A^{\prime \alpha}\) are related by Eqs. (97), while \(\ B_{\beta}\) and \(\ B_{\beta}^{\prime}\), by Eqs. (99). This requirement immediately yields

    \[\ \text{Lorentz transform of 4-tensors}\quad\quad\quad\quad T^{\alpha \beta}=\frac{\partial x^{\alpha}}{\partial x^{\prime \gamma}} \frac{\partial x^{\beta}}{\partial x^{\prime \delta}} T^{\prime \gamma \delta}, \quad T^{\prime \alpha \beta}=\frac{\partial x^{\prime \alpha}}{\partial x^{\gamma}} \frac{\partial x^{\prime \beta}}{\partial x^{\delta}} T^{\gamma \delta},\tag{9.109}\]

    with the implied summation over two indices, \(\ \gamma\) and \(\ \delta\). The rules for the covariant and mixed tensors are similar.44


    Reference

    37 This compact notation may take some time to get accustomed to, but it is very convenient and can hardly lead to any confusion, due to the following rule: the summation is implied when (and only when) an index appears twice, once on the top and once at the bottom. In this course, this shorthand notation will be used only for 4-vectors, but not for the usual (spatial) vectors.

    38 Note also that, by definition, for any two 4-vectors, \(\ A_{\alpha} B^{\alpha}=B^{\alpha} A_{\alpha}\).

    39 These forms are 4-vector extensions of the notions of contravariance and covariance, introduced in the 1850s by J. Sylvester (who also introduced the term “matrix”) for the description of the change of the usual geometric (3-component) vectors at the transfer between different reference frames – e.g., resulting from the frame rotation. In this case, the contravariance or covariance of a vector is uniquely determined by its nature: if the Cartesian coordinates of a vector (such as the non-relativistic velocity \(\ \mathbf{v}=d \mathbf{r} / d t\)) are transformed similarly to the radius-vector \(\ \mathbf{r}\), it is called contravariant, while the vectors (such as \(\ \nabla f\)) that require the reciprocal transform, are called covariant. In the Minkowski space, both forms may be used for any 4-vector.

    40 Just as the 4-vectors, 4-tensors with two top indices are called contravariant, and those with two bottom indices, covariant. The tensors with one top and one bottom index are called mixed.

    41 Another way to write this relation is \(\ (d s)^{2}=g_{\alpha \beta} d x^{\alpha} d x^{\beta}=g^{\alpha \beta} d x_{\alpha} d x_{\beta}\), where double summation over indices \(\ \alpha\) and \(\ \beta\) is implied, and \(\ g\) is the so-called metric tensor,

    \(\ g^{\alpha \beta} \equiv g_{\alpha \beta} \equiv\left(\begin{array}{cccc}
    1 & 0 & 0 & 0 \\
    0 & -1 & 0 & 0 \\
    0 & 0 & -1 & 0 \\
    0 & 0 & 0 & -1
    \end{array}\right),\)

    which may be used, in particular, to transfer a covariant vector into the corresponding contravariant one and back: \(\ A^{\alpha}=g^{\alpha \beta} A_{\beta}\), \(\ A_{\alpha}=g_{\alpha \beta} A^{\beta}\). The metric tensor plays a key role in general relativity, in which it is affected by gravity – “curved” by particles’ masses.

    42 Note that in the index balance rule, the top index in the denominator of a fraction is counted as a bottom index in the numerator, and vice versa.

    43 As was mentioned above, this is also a property of the reference-frame transform of the “usual” 3D vectors.

    44 It is straightforward to check that transfer between the contravariant and covariant forms of the same tensor may be readily achieved using the metric tensor \(\ g\): \(\ T_{\alpha \beta}=g_{\alpha \gamma} T^{\gamma \delta} g_{\delta \beta}, T^{\alpha \beta}=g^{\alpha \gamma} T_{\gamma \delta} g^{\delta \beta}\).


    This page titled 9.4: More on 4-vectors and 4-tensors is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Konstantin K. Likharev via source content that was edited to the style and standards of the LibreTexts platform.