9.3: 4-vectors, Momentum, Mass, and Energy
Before proceeding to the relativistic dynamics, let us discuss the mathematical formalism that makes all the calculations more compact – and more beautiful. We have already seen that the three spatial coordinates {x,y,z} and the product ct are Lorentz-transformed similarly – see Eqs. (18)-(19) again. So it is natural to consider them as components of a single four-component vector (or, for short, 4-vector),
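\[ \{x_0, x_1, x_2, x_3\} \equiv \{ct, \mathbf{r}\}, \]
with components
\[ x_0 \equiv ct, \quad x_1 \equiv x, \quad x_2 \equiv y, \quad x_3 \equiv z. \qquad \text{(space-time 4-vector)} \]
According to Eqs. (19), its components are Lorentz-transformed as
\[ x_j = \sum_{j'=0}^{3} L_{jj'} x'_{j'}, \qquad \text{(Lorentz transform: 4-form)} \tag{50} \]
where \( L_{jj'} \) are the elements of the following 4×4 Lorentz transform matrix:
\[ \begin{pmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \qquad \text{(Lorentz transform matrix)} \tag{51} \]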
Since such 4-vectors are a new notion for this course and will be used for many more purposes than just the space-time transform, we need to discuss the general mathematical rules they obey. Indeed, as was already mentioned in Sec. 8.9, the usual (three-component) vector is not just any ordered set (string) of three scalars \( \{A_x, A_y, A_z\} \); if we want it to represent a reference-frame-independent physical reality, the vector’s components have to obey certain rules at the transfer from one reference frame to another. In particular, in the non-relativistic limit the vector’s norm (its magnitude squared),
\[ A^2 = A_x^2 + A_y^2 + A_z^2, \tag{52} \]
should be invariant with respect to the transfer between different reference frames. However, a naïve extension of this approach to 4-vectors would not work, because, according to the calculations of Sec. 1, the Lorentz transform keeps intact the combinations of the type (7), with one sign negative, rather than the sum of all components squared. Hence for the 4-vectors, all the rules of the game have to be reviewed and adjusted – or rather redefined from the very beginning.
An arbitrary 4-vector is a string of 4 scalars,23
\[ \{A_0, A_1, A_2, A_3\}, \qquad \text{(general 4-vector)} \]
whose components \( A_j \), as measured in the systems 0 and 0’ shown in Fig. 1, obey the Lorentz transform relations similar to Eq. (50):
\[ A_j = \sum_{j'=0}^{3} L_{jj'} A'_{j'}. \qquad \text{(Lorentz transform: general 4-vector)} \tag{54} \]
As we have already seen on the example of the space-time 4-vector (48), this means in particular that
\[ A_0^2 - \sum_{j=1}^{3} A_j^2 = (A'_0)^2 - \sum_{j=1}^{3} (A'_j)^2. \qquad \text{(Lorentz invariance)} \]
This is the so-called Lorentz invariance condition for the 4-vector’s norm. (The difference between this relation and Eq. (52), pertaining to the Euclidian geometry, is the reason why the Minkowski space is called pseudo-Euclidian.) It is also straightforward to use Eqs. (51) and (54) to check that an evident generalization of the norm, the scalar product of two arbitrary 4-vectors,
\[ A_0 B_0 - \sum_{j=1}^{3} A_j B_j, \qquad \text{(scalar 4-product)} \]
is also Lorentz-invariant.
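The invariance of the scalar 4-product can also be checked numerically. The following is a minimal sketch, assuming a relative frame velocity along the x-axis; the helper names, the value β = 0.6, and the test components are illustrative choices, not part of the text.

```python
import math

def lorentz_matrix(beta):
    """Matrix of Eq. (51) for a relative frame velocity v = beta*c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, beta * gamma, 0.0, 0.0],
        [beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(matrix, a_primed):
    """Unprimed components A_j = sum over j' of L_jj' A'_j', as in Eq. (54)."""
    return [sum(matrix[j][k] * a_primed[k] for k in range(4)) for j in range(4)]

def scalar_product(a, b):
    """Scalar 4-product A_0 B_0 - (A_1 B_1 + A_2 B_2 + A_3 B_3)."""
    return a[0] * b[0] - sum(a[j] * b[j] for j in range(1, 4))

boost = lorentz_matrix(0.6)         # illustrative value of beta
a_primed = [2.0, 1.0, -0.5, 3.0]    # arbitrary test components in frame 0'
b_primed = [5.0, -1.0, 2.0, 0.5]
a = transform(boost, a_primed)
b = transform(boost, b_primed)

print("A.B in 0':", scalar_product(a_primed, b_primed), " in 0:", scalar_product(a, b))
print("A.A in 0':", scalar_product(a_primed, a_primed), " in 0:", scalar_product(a, a))
```

Up to rounding errors, each pair of printed numbers should coincide, although the individual components of the two vectors differ between the frames.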
Now consider the 4-vector corresponding to a small interval between two close world events:
\[ \{dx_0, dx_1, dx_2, dx_3\} = \{c\,dt, d\mathbf{r}\}; \]
its norm,
\[ (ds)^2 \equiv dx_0^2 - \sum_{j=1}^{3} dx_j^2 = c^2 (dt)^2 - (dr)^2, \qquad \text{(interval)} \tag{58} \]
is of course also Lorentz-invariant. Since the speed of any particle (or signal) cannot be larger than \( c \), for any pair of world events that are in a causal relation with each other, \( (dr)^2 \) cannot be larger than \( (c\,dt)^2 \), i.e. such a time-like interval \( (ds)^2 \) cannot be negative. The hypersurface \( (ds)^2 = 0 \) that separates such intervals from space-like intervals, with \( (ds)^2 < 0 \), is called the light cone (Fig. 9).
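As a small illustration of this classification, the sketch below evaluates the sign of the interval (58) for given separations in time and space. The function names and the label "light-like" for the boundary case (ds)² = 0 are assumptions of this example, not notation of the text.

```python
C = 299_792_458.0  # speed of light in m/s

def interval_squared(dt, dx, dy, dz):
    """(ds)^2 = c^2 (dt)^2 - (dr)^2, as in Eq. (58)."""
    return (C * dt) ** 2 - (dx**2 + dy**2 + dz**2)

def classify(dt, dx, dy, dz):
    """Name the interval by the sign of (ds)^2."""
    s2 = interval_squared(dt, dx, dy, dz)
    if s2 > 0:
        return "time-like"
    if s2 < 0:
        return "space-like"
    return "light-like"  # on the light cone

# Two events one second apart, at various spatial separations:
print(classify(1.0, 0.5 * C, 0.0, 0.0))  # reachable by a signal slower than c
print(classify(1.0, 2.0 * C, 0.0, 0.0))  # no causal relation is possible
print(classify(1.0, C, 0.0, 0.0))        # connected only by a light signal
```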

Now let us assume that two close world events happen with the same particle that moves with velocity \( \mathbf{u} \). Then in the frame moving with the particle (\( \mathbf{v} = \mathbf{u} \)), on the right-hand side of Eq. (58) the last term equals zero, while the involved time is the proper one, so that
\[ ds = c\,d\tau, \tag{59} \]
where \( d\tau \) is the proper time interval. But according to Eq. (21), this means that we can write
\[ d\tau = \frac{dt}{\gamma}, \tag{60} \]
where \( dt \) is the time interval in an arbitrary (but inertial) reference frame, while
\[ \boldsymbol{\beta} \equiv \frac{\mathbf{u}}{c} \quad \text{and} \quad \gamma \equiv \frac{1}{(1-\beta^2)^{1/2}} = \frac{1}{(1-u^2/c^2)^{1/2}} \tag{61} \]
are the parameters (17) corresponding to the particle’s velocity (\( \mathbf{u} \)) in that frame, so that \( ds = c\,dt/\gamma \).24
Let us use Eq. (60) to explore whether a 4-vector may be formed using the spatial components of the particle’s velocity
\[ \mathbf{u} = \left\{ \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right\}. \]
Here we have a slight problem: as Eqs. (22) show, these components do not obey the Lorentz transform. However, let us use \( d\tau \equiv dt/\gamma \), the proper time interval of the particle, to form the following string:
\[ \left\{ \frac{dx_0}{d\tau}, \frac{dx_1}{d\tau}, \frac{dx_2}{d\tau}, \frac{dx_3}{d\tau} \right\} \equiv \gamma \left\{ c, \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right\} \equiv \gamma \{c, \mathbf{u}\}. \qquad \text{(4-velocity)} \tag{63} \]
As follows from the comparison of the middle form of this expression with Eq. (48), since the time-space vector obeys the Lorentz transform, and \( \tau \) is Lorentz-invariant, the string (63) is a legitimate 4-vector; it is called the 4-velocity of the particle.
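To see numerically that the string (63) behaves as a 4-vector, one may build the 4-velocity in two ways: by Lorentz-transforming the 4-velocity found in frame 0’, and by first transforming the ordinary velocity and then forming γ{c, u} in frame 0. The sketch below does this in units with c = 1. It assumes the standard relativistic velocity transform for a frame moving along x (which is what Eqs. (22) are taken to express here); all names and numerical values are illustrative.

```python
import math

def gamma_of(speed):
    """Lorentz factor for a speed given in units of c."""
    return 1.0 / math.sqrt(1.0 - speed**2)

def four_velocity(u):
    """gamma*{c, u} with c = 1, for a 3-velocity u = (ux, uy, uz)."""
    g = gamma_of(math.sqrt(sum(x * x for x in u)))
    return [g, g * u[0], g * u[1], g * u[2]]

def boost_x(a_primed, beta):
    """Lorentz transform from frame 0' to frame 0, using the matrix of Eq. (51)."""
    g = gamma_of(beta)
    a0, a1, a2, a3 = a_primed
    return [g * (a0 + beta * a1), g * (beta * a0 + a1), a2, a3]

def velocity_to_lab(u_primed, beta):
    """Standard velocity transform (assumed here to match Eqs. (22))."""
    g = gamma_of(beta)
    denom = 1.0 + beta * u_primed[0]
    return (
        (u_primed[0] + beta) / denom,
        u_primed[1] / (g * denom),
        u_primed[2] / (g * denom),
    )

beta = 0.5                  # frame velocity, illustrative
u_primed = (0.3, 0.4, 0.0)  # particle velocity in frame 0', illustrative

via_boost = boost_x(four_velocity(u_primed), beta)
via_velocity = four_velocity(velocity_to_lab(u_primed, beta))
norm = via_boost[0] ** 2 - sum(x * x for x in via_boost[1:])

print(via_boost)
print(via_velocity)
print("norm (should be c^2 = 1):", norm)
```

The two routes should give the same four numbers, and the norm of the 4-velocity should equal c² in any frame, as follows from Eqs. (58)-(59).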
Now we are well equipped to proceed to relativistic dynamics. Let us start with such basic notions as the momentum \( \mathbf{p} \) and the energy \( E \), so far for a free particle.25 Perhaps the most elegant way to “derive” (or rather guess26) the expressions for \( \mathbf{p} \) and \( E \) as functions of the particle’s velocity \( \mathbf{u} \) is based on analytical mechanics. Due to the conservation of \( \mathbf{u} \), the trajectory of a free particle in the 4D Minkowski space \( \{ct, \mathbf{r}\} \) is always a straight line. Hence, from the Hamilton principle,27 we may expect its action \( S \), between points 1 and 2, to be a linear function of the space-time interval (59):
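\[ S = \alpha \int_1^2 ds \equiv \alpha c \int_1^2 d\tau \equiv \alpha c \int_{t_1}^{t_2} \frac{dt}{\gamma}, \qquad \text{(free particle: action)} \]
where \( \alpha \) is some constant. On the other hand, in analytical mechanics, the action is defined as
\[ S \equiv \int_{t_1}^{t_2} \mathcal{L}\, dt, \]
where \( \mathcal{L} \) is the particle’s Lagrangian function.28 Comparing these two expressions, we get
\[ \mathcal{L} = \frac{\alpha c}{\gamma} \equiv \alpha c \left(1 - \frac{u^2}{c^2}\right)^{1/2}. \]
In the non-relativistic limit (\( u \ll c \)), this function tends to
\[ \mathcal{L} \approx \alpha c \left(1 - \frac{u^2}{2c^2}\right) = \alpha c - \frac{\alpha u^2}{2c}. \]
To correspond to the Newtonian mechanics,29 the last (velocity-dependent) term should equal \( mu^2/2 \). From here we find \( \alpha = -mc \), so that, finally,
\[ \mathcal{L} = -mc^2 \left(1 - \frac{u^2}{c^2}\right)^{1/2} \equiv -\frac{mc^2}{\gamma}. \qquad \text{(free particle: Lagrangian function)} \]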
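The matching to the Newtonian limit can be illustrated numerically: with α = −mc, the velocity-dependent part of the Lagrangian, i.e. its value at speed u minus its value at u = 0, should approach mu²/2 as u/c decreases. A minimal sketch in units with m = c = 1 follows; the function names are illustrative.

```python
import math

def lagrangian(u, m=1.0, c=1.0):
    """Free-particle Lagrangian -m c^2 (1 - u^2/c^2)^(1/2)."""
    return -m * c**2 * math.sqrt(1.0 - (u / c) ** 2)

def ratio_to_newtonian(u, m=1.0, c=1.0):
    """Velocity-dependent part of the Lagrangian divided by m u^2 / 2."""
    velocity_dependent = lagrangian(u, m, c) - lagrangian(0.0, m, c)
    return velocity_dependent / (0.5 * m * u**2)

for u in (0.5, 0.1, 0.01):
    print(f"u/c = {u}: ratio = {ratio_to_newtonian(u):.6f}")
```

The ratio tends to 1 as u/c decreases; the remaining deviation is of the order of (u/c)², which is the size of the terms dropped in the expansion.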
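Now we can find the Cartesian components \( p_j \) of the particle’s momentum as the generalized momenta corresponding to the components \( r_j \) (\( j = 1, 2, 3 \)) of the 3D radius-vector \( \mathbf{r} \):30
\[ p_j = \frac{\partial \mathcal{L}}{\partial \dot{r}_j} \equiv \frac{\partial \mathcal{L}}{\partial u_j} = -mc^2 \frac{\partial}{\partial u_j} \left(1 - \frac{u_1^2 + u_2^2 + u_3^2}{c^2}\right)^{1/2} = \frac{m u_j}{(1 - u^2/c^2)^{1/2}} \equiv m\gamma u_j. \]
Thus for the 3D vector of momentum, we can write the result in the same form as in non-relativistic mechanics,
\[ \mathbf{p} = m\gamma\mathbf{u} \equiv M\mathbf{u}, \qquad \text{(relativistic momentum)} \tag{70} \]
using the reference-frame-dependent scalar \( M \) (called the relativistic mass) defined as
\[ M \equiv m\gamma = \frac{m}{(1 - u^2/c^2)^{1/2}} \geq m, \qquad \text{(relativistic mass)} \tag{71} \]
\( m \) being the non-relativistic mass of the particle. (More often, \( m \) is called the rest mass, because in the reference frame in which the particle rests, Eq. (71) yields \( M = m \).)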
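The differentiation leading to Eq. (70) is easy to cross-check numerically. The sketch below compares a central finite-difference derivative of the free-particle Lagrangian with mγu_j, in units with m = c = 1. The step size h and the test velocity are arbitrary illustrative choices.

```python
import math

M, C = 1.0, 1.0  # units with m = c = 1

def lagrangian(u):
    """Free-particle Lagrangian for a 3-velocity u = (u1, u2, u3)."""
    u2 = sum(x * x for x in u)
    return -M * C**2 * math.sqrt(1.0 - u2 / C**2)

def momentum_numeric(u, h=1e-6):
    """Generalized momenta dL/du_j from central finite differences."""
    p = []
    for j in range(3):
        up, um = list(u), list(u)
        up[j] += h
        um[j] -= h
        p.append((lagrangian(up) - lagrangian(um)) / (2.0 * h))
    return p

def momentum_exact(u):
    """m * gamma * u, as in Eq. (70)."""
    u2 = sum(x * x for x in u)
    gamma = 1.0 / math.sqrt(1.0 - u2 / C**2)
    return [M * gamma * x for x in u]

u = (0.3, -0.4, 0.5)  # illustrative velocity, |u| < c
print(momentum_numeric(u))
print(momentum_exact(u))
```

The two printed triples should agree to within the accuracy of the finite-difference approximation.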
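Next, let us return to analytical mechanics to calculate the particle’s energy \( E \) (which for a free particle coincides with its Hamiltonian function \( H \)):31
\[ E = H = \sum_{j=1}^{3} p_j u_j - \mathcal{L} = \mathbf{p} \cdot \mathbf{u} - \mathcal{L} = \frac{mu^2}{(1 - u^2/c^2)^{1/2}} + mc^2 \left(1 - \frac{u^2}{c^2}\right)^{1/2} \equiv \frac{mc^2}{(1 - u^2/c^2)^{1/2}}. \]
In terms of the relativistic mass (71), this result reads
\[ E = Mc^2, \tag{73} \]
which expresses the relation between the free particle’s mass and its energy.32 In the non-relativistic limit, it reduces to
\[ E = \frac{mc^2}{(1 - u^2/c^2)^{1/2}} \approx mc^2 \left(1 + \frac{u^2}{2c^2}\right) = mc^2 + \frac{mu^2}{2}, \]
the first term \( mc^2 \) being called the rest energy of a particle.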
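The algebra behind the energy formula, and the quality of its non-relativistic approximation, can be checked with a few lines of code. This is a minimal sketch for one-dimensional motion in units with m = c = 1; the function names are illustrative.

```python
import math

def gamma_of(u, c=1.0):
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def energy_from_hamiltonian(u, m=1.0, c=1.0):
    """E = p*u - L, with p = m*gamma*u and L = -m c^2 / gamma."""
    g = gamma_of(u, c)
    momentum = m * g * u
    lagrangian = -m * c**2 / g
    return momentum * u - lagrangian

def energy_closed_form(u, m=1.0, c=1.0):
    """E = M c^2 = m * gamma * c^2, as in Eq. (73)."""
    return m * gamma_of(u, c) * c**2

def energy_nonrelativistic(u, m=1.0, c=1.0):
    """Rest energy plus the Newtonian kinetic energy."""
    return m * c**2 + 0.5 * m * u**2

for u in (0.01, 0.1, 0.6):
    print(u, energy_from_hamiltonian(u), energy_closed_form(u), energy_nonrelativistic(u))
```

The first two columns of energies should coincide for any u < c, while the third one is close to them only for u much smaller than c.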
Now let us consider the following string of 4 scalars:
\[ \left\{ \frac{E}{c}, p_1, p_2, p_3 \right\} \equiv \left\{ \frac{E}{c}, \mathbf{p} \right\}. \qquad \text{(4-vector of energy-momentum)} \]
Using Eqs. (70) and (73) to represent this expression as
\[ \left\{ \frac{E}{c}, \mathbf{p} \right\} = m\gamma \{c, \mathbf{u}\}, \]
and comparing the result with Eq. (63), we immediately see that, since \( m \) is a Lorentz-invariant constant, this string is a legitimate 4-vector of energy-momentum. As a result, its norm,
\[ \left(\frac{E}{c}\right)^2 - p^2, \]
is Lorentz-invariant, and in particular, has to be equal to the norm in the particle-bound frame. But in that frame, \( p = 0 \) and, according to Eq. (73), \( E = mc^2 \), so that the norm is just
\[ \left(\frac{E}{c}\right)^2 = \left(\frac{mc^2}{c}\right)^2 \equiv (mc)^2, \]
and hence in an arbitrary frame
\[ \left(\frac{E}{c}\right)^2 - p^2 = (mc)^2. \]
This very important relation33 between the relativistic energy and momentum (valid for free particles only!) is usually represented in the form34
\[ E^2 = (mc^2)^2 + (pc)^2. \qquad \text{(free particle: energy)} \tag{78} \]
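A numerical illustration of this invariance: the sketch below forms the energy-momentum 4-vector mγ{c, u} in frame 0’, Lorentz-transforms it to frame 0, and checks that the norm stays equal to (mc)² while E and p both change. It uses units with c = 1; the names, the mass, and the velocities are illustrative assumptions.

```python
import math

def energy_momentum(m, u):
    """{E/c, p} = m*gamma*{c, u} with c = 1, for a 3-velocity u."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in u))
    return [m * gamma, m * gamma * u[0], m * gamma * u[1], m * gamma * u[2]]

def boost_x(a_primed, beta):
    """Lorentz transform from frame 0' to frame 0, using the matrix of Eq. (51)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    a0, a1, a2, a3 = a_primed
    return [g * (a0 + beta * a1), g * (beta * a0 + a1), a2, a3]

def norm(a):
    """(E/c)^2 - p^2 for an energy-momentum 4-vector."""
    return a[0] ** 2 - sum(x * x for x in a[1:])

m = 2.0
p4_primed = energy_momentum(m, (0.3, 0.4, 0.0))
p4 = boost_x(p4_primed, 0.5)

print("frame 0': E =", p4_primed[0], " norm =", norm(p4_primed))
print("frame 0 : E =", p4[0], " norm =", norm(p4))
print("(mc)^2  =", m**2)
```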
According to Eq. (70), in the so-called ultra-relativistic limit \( u \to c \), \( p \) tends to infinity while \( mc^2 \) stays constant, so that \( pc/mc^2 \to \infty \). As follows from Eq. (78), in this limit \( E \approx pc \). Though the above discussion was for particles with finite \( m \), the 4-vector formalism allows us to consider compact objects with zero rest mass as ultra-relativistic particles for which the above energy-to-momentum relation,
\[ E = pc, \quad \text{for } m = 0, \]
is exact. Quantum electrodynamics35 tells us that under certain conditions, the electromagnetic field quanta (photons) may also be considered as such massless particles with momentum \( \mathbf{p} = \hbar\mathbf{k} \). Plugging (the modulus of) the last relation into Eq. (78), for the photon’s energy we get \( E = pc = \hbar kc = \hbar\omega \). Please note again that according to Eq. (73), the relativistic mass of a photon is not equal to zero: \( M = E/c^2 = \hbar\omega/c^2 \), so that the term “massless particle” has a limited meaning: \( m = 0 \). For example, the relativistic mass of an optical photon is of the order of \( 10^{-36} \) kg. On the human scale, this is not too much, but still a noticeable (approximately one-millionth) part of the rest mass \( m_e \) of an electron.
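The order-of-magnitude estimate for the photon may be reproduced as follows. The sketch assumes a wavelength of 500 nm as a representative optical value (this number is not taken from the text) and uses rounded values of the fundamental constants.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J*s
C = 299_792_458.0             # speed of light, m/s
M_ELECTRON = 9.1093837e-31    # electron rest mass, kg

wavelength = 500e-9           # assumed optical wavelength, m
omega = 2.0 * math.pi * C / wavelength   # angular frequency
energy = HBAR * omega                    # E = hbar * omega
relativistic_mass = energy / C**2        # M = E / c^2

print(f"E = {energy:.3e} J")
print(f"M = {relativistic_mass:.3e} kg")
print(f"M / m_e = {relativistic_mass / M_ELECTRON:.2e}")
```

For this wavelength, M comes out at a few times 10⁻³⁶ kg, i.e. a few millionths of the electron’s rest mass, consistent with the order-of-magnitude statement above.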
The fundamental relations (70) and (73) have been repeatedly verified in numerous particle collision experiments, in which the total energy and momentum of a system of particles are conserved under the same conditions as in non-relativistic dynamics. (For the momentum, this is the absence of external forces, and for the energy, the elasticity of particle interactions, in other words the absence of alternative channels of energy escape.) Of course, generally only the total energy of the system is conserved, including the potential energy of particle interactions. However, at typical high-energy particle collisions, the potential energy vanishes so rapidly with the distance between the particles that we can apply the momentum and energy conservation laws with the energy of each particle given by Eq. (73).
As an example, let us calculate the minimum energy \( E_{\min} \) of a proton (\( p_a \)), necessary for the well-known high-energy reaction that generates a new proton-antiproton pair, \( p_a + p_b \to p + p + p + \bar{p} \), provided that before the collision, the proton \( p_b \) had been at rest in the lab frame. This minimum corresponds to the vanishing relative velocity of the reaction products, i.e. their motion with virtually the same velocity (\( \mathbf{u}_{\rm fin} \)), as seen from the lab frame – see Fig. 10.

Due to the momentum conservation, this velocity should have the same direction as the initial velocity (\( \mathbf{u}_{\min} \)) of proton \( p_a \). This is why two scalar equations: for energy conservation,
\[ \frac{mc^2}{(1 - u_{\min}^2/c^2)^{1/2}} + mc^2 = \frac{4mc^2}{(1 - u_{\rm fin}^2/c^2)^{1/2}}, \]
and for momentum conservation,
\[ \frac{m u_{\min}}{(1 - u_{\min}^2/c^2)^{1/2}} + 0 = \frac{4 m u_{\rm fin}}{(1 - u_{\rm fin}^2/c^2)^{1/2}}, \]
are sufficient to find both \( u_{\min} \) and \( u_{\rm fin} \). After a conceptually simple but technically somewhat tedious solution of this system of two nonlinear equations, we get
\[ u_{\min} = \frac{4\sqrt{3}}{7}\, c, \qquad u_{\rm fin} = \frac{\sqrt{3}}{2}\, c. \tag{81} \]
Finally, we can use Eq. (73) to calculate the required energy; the result is \( E_{\min} = 7mc^2 \). (Note that at this threshold, only \( 2mc^2 \) of the kinetic energy \( T_{\min} = E_{\min} - mc^2 = 6mc^2 \) of the initial moving particle goes into the “useful” proton-antiproton pair production.) The proton’s rest mass, \( m_p \approx 1.67 \times 10^{-27} \) kg, corresponds to \( m_p c^2 \approx 1.502 \times 10^{-10} \) J \( \approx 0.938 \) GeV, so that \( E_{\min} \approx 6.57 \) GeV.
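The “technically somewhat tedious” solution may be delegated to a computer. The sketch below works in units with m = c = 1 and is only one possible numerical route, not the analytical solution implied above: dividing the momentum conservation equation by the energy conservation equation gives u_fin for any trial u_min, and the remaining energy balance is then solved for u_min by bisection. The names and the bracketing interval are illustrative choices.

```python
import math

def gamma_of(u):
    """Lorentz factor for a speed u given in units of c."""
    return 1.0 / math.sqrt(1.0 - u * u)

def residual(u_min):
    """Energy imbalance for a trial u_min, and the corresponding u_fin."""
    g = gamma_of(u_min)
    energy = g + 1.0             # left-hand side of energy conservation
    momentum = g * u_min         # left-hand side of momentum conservation
    u_fin = momentum / energy    # ratio of the two right-hand sides
    return energy - 4.0 * gamma_of(u_fin), u_fin

lo, hi = 0.0, 0.999999           # residual is negative at lo, positive at hi
for _ in range(200):             # plain bisection
    mid = 0.5 * (lo + hi)
    if residual(mid)[0] < 0.0:
        lo = mid
    else:
        hi = mid

u_min = 0.5 * (lo + hi)
u_fin = residual(u_min)[1]
e_min = gamma_of(u_min)          # E_min in units of m c^2

print("u_min/c =", u_min, " expected", 4.0 * math.sqrt(3.0) / 7.0)
print("u_fin/c =", u_fin, " expected", math.sqrt(3.0) / 2.0)
print("E_min/(m c^2) =", e_min)
print("E_min in GeV  =", e_min * 0.938)  # using m_p c^2 = 0.938 GeV from the text
```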
The second, more intelligent way to solve the same problem is to use the center-of-mass (c.o.m.) reference frame that, in relativity, is defined as the frame in which the total momentum of the system vanishes.36 In this frame, at \( E = E_{\min} \), the velocities and momenta of all reaction products vanish, while the velocities of protons \( p_a \) and \( p_b \) before the collision are equal and opposite, with an initially unknown magnitude \( u' \). Hence the energy conservation law becomes
\[ \frac{2mc^2}{(1 - u'^2/c^2)^{1/2}} = 4mc^2, \]
readily giving \( u' = (\sqrt{3}/2)c \). (This is of course the same result as Eq. (81) gives for \( u_{\rm fin} \).) Now we can use the fact that in the c.o.m. frame, the proton \( p_b \) (which rests in the lab frame) has velocity \( (-u') \), while the proton \( p_a \) has velocity \( u' \), to find the lab-frame speed of \( p_a \) using the velocity transform (25):
\[ u_{\min} = \frac{2u'}{1 + u'^2/c^2}. \]
With the above result for \( u' \), this relation gives the same result as the first method, \( u_{\min} = (4\sqrt{3}/7)c \), but in a simpler way.
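The c.o.m. route takes only a few lines of arithmetic, sketched below in units with m = c = 1 (the variable names are illustrative): the energy balance gives γ’ = 2, which fixes u’, and the velocity transform then gives u_min and hence E_min.

```python
import math

gamma_cm = 2.0                                   # from 2*gamma' = 4
u_cm = math.sqrt(1.0 - 1.0 / gamma_cm**2)        # u'/c
u_min = 2.0 * u_cm / (1.0 + u_cm**2)             # velocity transform, with c = 1
e_min = 1.0 / math.sqrt(1.0 - u_min**2)          # E_min in units of m c^2

print("u'/c    =", u_cm)
print("u_min/c =", u_min)
print("E_min/(m c^2) =", e_min)
```

The printed values should reproduce u’ = (√3/2)c, u_min = (4√3/7)c, and E_min = 7mc² up to rounding.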
Reference
23 Such vectors are said to reside in so-called 4D Minkowski spaces – called after Hermann Minkowski who was the first one to recast (in 1907) the special relativity relations in a form in which the spatial coordinates and time (or rather ct) are treated on an equal footing.
24 I have opted against using special indices (e.g., \( \beta_u, \gamma_u \)) to distinguish Eqs. (17) and (61) here and below, in the hope that the suitable velocity (of either a reference frame or a particle) will always be clear from the context.
25 I am sorry for using, just as in Sec. 6.3, the same traditional notation (p) for the particle’s momentum as had been used earlier for the electric dipole moment. However, since the latter notion will be virtually unused in the balance of this course, this may hardly lead to confusion.
26 Indeed, such a derivation uses additional assumptions, however natural (such as the Lorentz-invariance of S), i.e. it can hardly be considered as a real proof of the final results, so that they require experimental confirmation. Fortunately, such confirmations have been numerous – see below.
27 See, e.g., CM Sec. 10.3.
28 See, e.g., CM Sec. 2.1.
29 See, e.g., CM Eq. (2.19b).
30 See, e.g., CM Sec. 2.3, in particular Eq. (2.31).
31 See, e.g., CM Eq. (2.32).
32 Let me hope that the reader understands that all the layman talk about the “mass to energy conversion” is only valid in a very limited sense of the word. While the Einstein relation (73) does allow the conversion of “massive” particles (with \( m \neq 0 \)) into particles with \( m = 0 \), such as photons, each of the latter particles also has a non-zero relativistic mass \( M \), and simultaneously the energy \( E \) related to this \( M \) by Eq. (73).
33 Please note one more simple and useful relation following from Eqs. (70) and (73): \( \mathbf{p} = (E/c^2)\mathbf{u} \).
34 It may be tempting to interpret this relation as the perpendicular-vector-like addition of the rest energy \( mc^2 \) and the “kinetic energy” \( pc \), but from the point of view of the total energy conservation (see below), a better definition of the kinetic energy is \( T(u) \equiv E(u) - E(0) \).
35 It is briefly reviewed in QM Chapter 9.
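36 Note that according to this definition, the c.o.m.’s radius-vector is \( \mathbf{R} = \sum_k M_k \mathbf{r}_k / \sum_k M_k \equiv \sum_k \gamma_k m_k \mathbf{r}_k / \sum_k \gamma_k m_k \), i.e. is generally different from the well-known non-relativistic expression \( \mathbf{R} = \sum_k m_k \mathbf{r}_k / \sum_k m_k \).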