Physics LibreTexts

11.2: The Dirac Equation

Consider the motion of an electron in the absence of an electromagnetic field. In classical relativity, electron energy, $ E$ , is related to electron momentum, $ {\bf p}$ , according to the well-known formula


$\displaystyle \frac{E}{c}=(p^2+ m_e^2\,c^2)^{1/2},$ (1112)



where $ m_e$ is the electron rest mass. The quantum mechanical equivalent of this expression is the wave equation


$\displaystyle \left[p^0 - (p^1\,p^1+p^2\,p^2+p^3\,p^3+m_e^2\,c^2)^{1/2}\right]\psi = 0,$ (1113)



where the $ p$ 's are interpreted as differential operators according to Equation (1111). The above equation takes into account the correct relativistic relation between electron energy and momentum, but is nevertheless unsatisfactory from the point of view of relativistic theory, because it is highly asymmetric between $ p^0$ and the other $ p$ 's. This makes the equation difficult to generalize, in a manifestly Lorentz invariant manner, in the presence of an electromagnetic field. We must therefore look for a new equation.

If we multiply the wave equation (1113) by the operator $ \left[p^0 +(p^1\,p^1+p^2\,p^2+p^3\,p^3+m_e^2\,c^2)^{1/2}\right]$ then we obtain


$\displaystyle \left(p^0\,p^0- p^1\,p^1-p^2\,p^2-p^3\,p^3-m_e^2\,c^2\right)\psi = \left(p^{\,\mu}\,p_\mu-m_e^2\,c^2\right)\psi = 0.$ (1114)



This equation is manifestly Lorentz invariant, and, therefore, forms a more convenient starting point for relativistic quantum mechanics. Note, however, that Equation (1114) is not entirely equivalent to Equation (1113), because, although every solution of (1113) is also a solution of (1114), the converse is not true. In fact, only those solutions of (1114) belonging to positive values of $ p^0$ are also solutions of (1113).

The wave equation (1114) is quadratic in $ p^0$ , and is thus not of the form required by the laws of quantum theory. (Recall that we showed, from general principles, in Chapter 3, that the time evolution equation for the wavefunction should be linear in the operator $ \partial/\partial t$ , and, hence, in $ p^0$ .) We, therefore, seek a wave equation that is equivalent to (1114), but is linear in $ p^0$ . In order to ensure that this equation transforms in a simple way under a Lorentz transformation, we shall require it to be rational and linear in $ p^1$ , $ p^2$ , $ p^3$ , as well as $ p^0$ . We are thus led to a wave equation of the form


$\displaystyle \left(p^0 - \alpha_1\,p^1-\alpha_2\,p^2-\alpha_3\,p^3-\beta\,m_e\,c\right)\psi = 0,$ (1115)



where the $ \alpha$ 's and $ \beta$ are dimensionless, and independent of the $ p$ 's. Moreover, according to standard relativity, because we are considering the case of no electromagnetic field, all points in space-time must be equivalent. Hence, the $ \alpha$ 's and $ \beta$ must also be independent of the $ x$ 's. This implies that the $ \alpha$ 's and $ \beta$ commute with the $ p$ 's and the $ x$ 's. We, therefore, deduce that the $ \alpha$ 's and $ \beta$ describe an internal degree of freedom that is independent of space-time coordinates. Actually, we shall show later that these operators are related to electron spin.

Multiplying (1115) by the operator $ p^0 +\alpha_1\,p^1+\alpha_2\,p^2+\alpha_3\,p^3+\beta\,m_e\,c$ , we obtain


$\displaystyle \left[p^0\,p^0-\frac{1}{2}\sum_{i,j=1,3}\{\alpha_i,\alpha_j\}\,p^i\,p^j-\sum_{i=1,3}\{\alpha_i,\beta\}\,p^i\,m_e\,c-\beta^{\,2}\,m_e^2\,c^2\right]\psi= 0,$ (1116)



where $ \{a,b\}\equiv a\,b+b\,a$ . This equation is equivalent to (1114) provided that


$\displaystyle \{\alpha_i,\alpha_j\}$ $\displaystyle =2\,\delta_{ij},$ (1117)
$\displaystyle \{\alpha_i,\beta\}$ $\displaystyle = 0,$ (1118)
$\displaystyle \beta^{\,2}$ $\displaystyle = 1,$ (1119)



for $ i,j=1,3$ . It is helpful to define the $ \gamma^{\,\mu}$ , for $ \mu=0,3$ , where


$\displaystyle \beta$ $\displaystyle =\gamma^0,$ (1120)
$\displaystyle \alpha_i$ $\displaystyle = \gamma^0\,\gamma^i,$ (1121)



for $ i=1,3$ . Equations (1117)-(1119) can then be shown to reduce to


$\displaystyle \{\gamma^{\,\mu},\gamma^\nu\} = 2\,g^{\,\mu\,\nu}.$ (1122)
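The reduction can be checked case by case. The following is a sketch of the intermediate steps. It uses only (1117)-(1121), together with $ \gamma^i = \beta\,\alpha_i$ (which follows from multiplying (1121) on the left by $ \gamma^0=\beta$ and using $ \beta^{\,2}=1$ ), and it assumes the metric $ g^{00}=1$ , $ g^{ii}=-1$ , $ g^{\,\mu\nu}=0$ for $ \mu\neq\nu$ , which is the signature implied by the combination $ p^{\,\mu}\,p_\mu$ in (1114):

```latex
\begin{aligned}
\{\gamma^0,\gamma^0\} &= 2\,\beta^{\,2} = 2 = 2\,g^{00},\\
\{\gamma^0,\gamma^i\} &= \beta\,\beta\,\alpha_i + \beta\,\alpha_i\,\beta
   = \beta\,\{\beta,\alpha_i\} = 0 = 2\,g^{0i},\\
\{\gamma^i,\gamma^j\} &= \beta\,\alpha_i\,\beta\,\alpha_j + \beta\,\alpha_j\,\beta\,\alpha_i
   = -\beta^{\,2}\,\{\alpha_i,\alpha_j\} = -2\,\delta_{ij} = 2\,g^{ij}.
\end{aligned}
```

In the last line, (1118) has been used to move each $ \beta$ to the left past an $ \alpha$ , at the cost of a minus sign.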



One way of satisfying the above anti-commutation relations is to represent the operators $ \gamma^{\,\mu}$ as matrices. However, it turns out that the smallest dimension in which the $ \gamma^{\,\mu}$ can be realized is four. In fact, it is easily verified that the $ 4\times 4$ matrices


$\displaystyle \gamma^0$ $\displaystyle = \left(\begin{array}{rr} 1& 0\\ [0.5ex]0&-1\end{array}\right),$ (1123)
$\displaystyle \gamma^i$ $\displaystyle = \left(\begin{array}{rr} 0& \sigma_i\\ [0.5ex]-\sigma_i&0\end{array}\right),$ (1124)



for $ i=1,3$ , satisfy the appropriate anti-commutation relations. Here, 0 and $ 1$ denote $ 2\times 2$ null and identity matrices, respectively, whereas the $ \sigma_i$ represent the $ 2\times 2$ Pauli matrices introduced in Section 5.7. It follows from (1120) and (1121) that


$\displaystyle \beta$ $\displaystyle = \left(\begin{array}{rr} 1& 0\\ [0.5ex]0&-1\end{array}\right),$ (1125)
$\displaystyle \alpha_i$ $\displaystyle = \left(\begin{array}{rr} 0& \sigma_i\\ [0.5ex]\sigma_i&0\end{array}\right).$ (1126)
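The statement that these matrices satisfy the required anti-commutation relations can be checked numerically. The following is a minimal sketch using NumPy, assuming the standard representation of the Pauli matrices; the helper names (`anticommutator`, `check_clifford`, `check_alpha_beta`) are illustrative and do not come from the text.

```python
import numpy as np

# 2x2 building blocks: identity, null matrix, and the Pauli matrices
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Gamma matrices in the representation (1123)-(1124)
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# beta and alpha_i built from the gammas via (1120)-(1121)
beta = gamma[0]
alpha = [gamma[0] @ gamma[i] for i in (1, 2, 3)]

# Metric tensor, assumed to have signature (+, -, -, -)
g = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4)


def anticommutator(a, b):
    """Return {a, b} = a b + b a."""
    return a @ b + b @ a


def check_clifford():
    """True if {gamma^mu, gamma^nu} = 2 g^{mu nu} for all mu, nu, as in (1122)."""
    return all(
        np.allclose(anticommutator(gamma[m], gamma[n]), 2 * g[m, n] * I4)
        for m in range(4)
        for n in range(4)
    )


def check_alpha_beta():
    """True if the alpha_i and beta satisfy (1117)-(1119)."""
    ok = np.allclose(beta @ beta, I4)
    for i in range(3):
        ok = ok and np.allclose(anticommutator(alpha[i], beta), 0)
        for j in range(3):
            target = 2 * (i == j) * I4
            ok = ok and np.allclose(anticommutator(alpha[i], alpha[j]), target)
    return bool(ok)


print(check_clifford(), check_alpha_beta())
```

Both checks should report `True` for this representation. The same script can be pointed at any other candidate set of matrices by replacing the `gamma` list.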



Note that $ \gamma^0$ , $ \beta$ , and the $ \alpha_i$ , are all Hermitian matrices, whereas the $ \gamma^{\,\mu}$ , for $ \mu=1,3$ , are anti-Hermitian. However, the matrices $ \gamma^0\,\gamma^{\,\mu}$ , for $ \mu=0,3$ , are Hermitian. Moreover, it is easily demonstrated that


$\displaystyle \gamma^{\,\mu\,\dag } = \gamma^0\,\gamma^{\,\mu}\,\gamma^0,$ (1127)



for $ \mu=0,3$ .
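The Hermiticity properties quoted above, and the identity (1127), can be confirmed in the same spirit. This is a small sketch under the same assumptions as before (NumPy, standard Pauli matrices, the representation (1123)-(1124)); the variable names are illustrative only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# gamma[0] is gamma^0; gamma[1..3] are gamma^1..gamma^3
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]


def dagger(m):
    """Hermitian conjugate (conjugate transpose) of a matrix."""
    return m.conj().T


# For each mu = 0..3: is gamma^mu Hermitian, or anti-Hermitian?
is_hermitian = [bool(np.allclose(dagger(gm), gm)) for gm in gamma]
is_antihermitian = [bool(np.allclose(dagger(gm), -gm)) for gm in gamma]

# Is the product gamma^0 gamma^mu Hermitian?
product_hermitian = [
    bool(np.allclose(dagger(gamma[0] @ gm), gamma[0] @ gm)) for gm in gamma
]

# Does gamma^mu-dagger equal gamma^0 gamma^mu gamma^0, as in (1127)?
identity_1127 = [
    bool(np.allclose(dagger(gm), gamma[0] @ gm @ gamma[0])) for gm in gamma
]

print(is_hermitian)
print(is_antihermitian)
print(product_hermitian)
print(identity_1127)
```

Only the first entry of `is_hermitian` should be true, only the last three entries of `is_antihermitian` should be true, and the remaining two lists should be true throughout.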

Equation (1115) can be written in the form


$\displaystyle (\gamma^{\,\mu}\,p_\mu - m_e\,c)\,\psi = ({\rm i}\,\hbar\,\gamma^{\,\mu}\,\partial_\mu-m_e\,c)\,\psi=0,$ (1128)



where $ \partial_\mu\equiv \partial/\partial x^{\,\mu}$ . Alternatively, we can write


$\displaystyle {\rm i}\,\hbar\,\frac{\partial\psi}{\partial t} = (c\,\boldsymbol{\alpha}\cdot{\bf p} + \beta\,m_e\,c^2)\,\psi,$ (1129)



where $ {\bf p}=(p_x,p_y,p_z)=(p^1,p^2,p^3)$ , and $ \boldsymbol{\alpha}$ is the vector of the $ \alpha_i$ matrices. The previous expression is known as the Dirac equation. Incidentally, it is clear that, corresponding to the four rows and columns of the $ \gamma^{\,\mu}$ matrices, the wavefunction $ \psi$ must take the form of a $ 4\times 1$ column matrix, each element of which is, in general, a function of the $ x^{\,\mu}$ . We saw in Section 5.7 that the spin of the electron requires the wavefunction to have two components. The reason our present theory requires the wavefunction to have four components is that the wave equation (1114) has twice as many solutions as it ought to have, half of them corresponding to negative energy states.
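The doubling of solutions can be seen directly for a state of definite momentum, for which the operator on the right-hand side of (1129) reduces to an ordinary $ 4\times 4$ Hermitian matrix. The sketch below diagonalizes that matrix with NumPy. It assumes units in which $ m_e=c=1$ and an arbitrary illustrative momentum; the function names are not from the text.

```python
import numpy as np

# Pauli matrices (standard representation)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# alpha_i and beta as in (1125)-(1126)
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]
BETA = np.block([[I2, Z2], [Z2, -I2]])


def dirac_hamiltonian(p, m=1.0, c=1.0):
    """Matrix c*alpha.p + beta*m*c^2 for a fixed momentum 3-vector p."""
    H = BETA * (m * c**2)
    for p_i, alpha_i in zip(p, ALPHA):
        H = H + c * p_i * alpha_i
    return H


def free_energies(p, m=1.0, c=1.0):
    """Eigenvalues of the free Dirac Hamiltonian, in ascending order."""
    return np.linalg.eigvalsh(dirac_hamiltonian(p, m, c))


p = np.array([0.3, -0.4, 1.2])   # arbitrary illustrative momentum
E = np.sqrt(p @ p + 1.0)          # (c^2 p^2 + m^2 c^4)^(1/2) with m = c = 1

print(free_energies(p))           # two eigenvalues near -E, two near +E
print(E)
```

The four eigenvalues come in two degenerate pairs, $ \pm(c^2\,p^2+m_e^2\,c^4)^{1/2}$ : the positive pair matches the classical relation (1112), and the negative pair corresponds to the negative energy states mentioned above.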

We can incorporate an electromagnetic field into the above formalism by means of the standard prescription $ E\rightarrow E+e\,\phi$ , and $ p^{\,i}\rightarrow p^{\,i} +e\,A^{\,i}$ , where $ e$ is the magnitude of the electron charge, $ \phi$ the scalar potential, and $ {\bf A}$ the vector potential. This prescription can be expressed in the Lorentz invariant form


$\displaystyle p^{\,\mu} \rightarrow p^{\,\mu} + \frac{e}{c}\,{\mit\Phi}^{\,\mu},$ (1130)



where $ {\mit\Phi}^{\,\mu} = (\phi,c\,{\bf A})$ is the potential 4-vector. Thus, Equation (1128) becomes


$\displaystyle \left[\gamma^{\,\mu}\left(p_\mu+\frac{e}{c}\,{\mit\Phi}_\mu\right)-m_e\,c\right]\psi = \left[\gamma^{\,\mu}\left({\rm i}\,\hbar\,\partial_\mu+\frac{e}{c}\,{\mit\Phi}_\mu\right)-m_e\,c\right]\psi=0 ,$ (1131)



whereas Equation (1129) generalizes to


$\displaystyle {\rm i}\,\hbar\,\frac{\partial\psi}{\partial t} =\left[-e\,\phi + c\,\boldsymbol{\alpha}\cdot({\bf p}+e\,{\bf A})+ \beta\,m_e\,c^2\right]\psi.$ (1132)



If we write the wavefunction in the spinor form


$\displaystyle \psi= \left(\begin{array}{c}\psi_0\\ [0.5ex]\psi_1\\ [0.5ex]\psi_2\\ [0.5ex]\psi_3\end{array}\right)$ (1133)



then the Hermitian conjugate of Equation (1132) becomes


$\displaystyle -{\rm i}\,\hbar\,\frac{\partial\psi^\dag }{\partial t} =\psi^\dag\left[-e\,\phi + c\,\boldsymbol{\alpha}\cdot({\bf p} +e\,{\bf A})+ \beta\,m_e\,c^2\right],$ (1134)



where


$\displaystyle \psi^\dag = \left(\psi_0^{\,\ast}, \psi_1^{\,\ast},\psi_2^{\,\ast},\psi_3^{\,\ast}\right).$ (1135)



Here, use has been made of the fact that the $ \alpha_i$ and $ \beta$ are Hermitian matrices that commute with the $ p^i$ and $ A^i$ .

It follows from $ \psi^\dag\,\gamma^0$ times Equation (1131) that


$\displaystyle \psi^\dag\left[\gamma^0\,\gamma^{\,\mu}\left({\rm i}\,\hbar\,\partial_\mu+\frac{e}{c}\,{\mit\Phi}_\mu\right)-\gamma^0\,m_e\,c\right]\psi=0.$ (1136)



The Hermitian conjugate of this expression is


$\displaystyle \psi^\dag\left[\left(-{\rm i}\,\hbar\,\partial_\mu+ \frac{e}{c}\,{\mit\Phi}_\mu\right)\gamma^0\,\gamma^{\,\mu}-m_e\,c\,\gamma^0\,\right]\psi=0,$ (1137)



where $ \partial_\mu$ now acts backward on $ \psi^\dag$ , and use has been made of the fact that the matrices $ \gamma^0\,\gamma^{\,\mu}$ and $ \gamma^0$ are Hermitian. Taking the difference between the previous two equations, we obtain


$\displaystyle \partial_\mu\,j^{\,\mu} = 0,$ (1138)



where


$\displaystyle j^{\,\mu} = c\,\psi^\dag\,\gamma^0\,\gamma^{\,\mu}\,\psi.$ (1139)
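The step from (1136) and (1137) to (1138) can be sketched as follows. The terms involving $ {\mit\Phi}_\mu$ and $ m_e\,c$ are the same in both equations, because $ {\mit\Phi}_\mu$ is an ordinary function of the $ x^{\,\mu}$ that commutes with the $ \gamma$ matrices, so they cancel in the difference. Writing out the remaining derivative terms, with the backward-acting derivative of (1137) made explicit, and using the fact that the $ \gamma^{\,\mu}$ are constant matrices:

```latex
\begin{aligned}
0 &= \psi^\dag\,\gamma^0\,\gamma^{\,\mu}\,({\rm i}\,\hbar\,\partial_\mu\psi)
     - (-{\rm i}\,\hbar\,\partial_\mu\psi^\dag)\,\gamma^0\,\gamma^{\,\mu}\,\psi \\
  &= {\rm i}\,\hbar\,\partial_\mu\!\left(\psi^\dag\,\gamma^0\,\gamma^{\,\mu}\,\psi\right)
   = \frac{{\rm i}\,\hbar}{c}\,\partial_\mu\,j^{\,\mu}.
\end{aligned}
```

The second line is just the product rule applied to $ \psi^\dag\,\gamma^0\,\gamma^{\,\mu}\,\psi$ .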



Writing $ j^{\,\mu} = (c\,\rho, {\bf j})$ , where


$\displaystyle \rho$ $\displaystyle = \psi^\dag\,\psi,$ (1140)
$\displaystyle j^{\,i}$ $\displaystyle = c\,\psi^\dag\,\gamma^0\,\gamma^i\,\psi = \psi^\dag\,c\,\alpha_i\,\psi,$ (1141)



Equation (1138) becomes


$\displaystyle \frac{\partial\rho}{\partial t} + \nabla\cdot{\bf j} = 0.$ (1142)



The above expression has the same form as the non-relativistic probability conservation equation (284). This suggests that we can interpret the positive definite real scalar field $ \rho({\bf x},t) = \vert\psi\vert^{\,2}$ as the relativistic probability density, and the vector field $ {\bf j}({\bf x},t)$ as the relativistic probability current. Integration of the above expression over all space, assuming that $ \vert\psi({\bf x},t)\vert\rightarrow 0$ as $ \vert{\bf x}\vert\rightarrow \infty$ , yields


$\displaystyle \frac{d}{dt}\!\int d^3 x\,\,\rho({\bf x},t) = 0.$ (1143)



This ensures that if the wavefunction is properly normalized at time $ t=0$ , such that


$\displaystyle \int d^3 x\,\,\rho({\bf x},0) = 1,$ (1144)



then the wavefunction remains properly normalized at all subsequent times, as it evolves in accordance with the Dirac equation. In fact, if this were not the case then it would be impossible to interpret $ \rho$ as a probability density. Now, relativistic invariance demands that if the wavefunction is properly normalized in one particular inertial frame then it should be properly normalized in all inertial frames. This is the case provided that Equation (1138) is Lorentz invariant (i.e., if it has the property that if it holds in one inertial frame then it holds in all inertial frames), which is true as long as the $ j^{\,\mu}$ transform as the contravariant components of a 4-vector under Lorentz transformation (see Exercise 4).
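Conservation of normalization can be illustrated numerically in the simplest setting: a single momentum component of a free-particle wavefunction, for which (1129) becomes an ordinary differential equation for the four spinor amplitudes. The following is only a sketch under stated assumptions: units with $ \hbar=m_e=c=1$ , no electromagnetic field, a randomly chosen initial spinor, and illustrative function names that are not part of the text.

```python
import numpy as np

# Pauli matrices and the alpha_i, beta matrices of (1125)-(1126)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]
BETA = np.block([[I2, Z2], [Z2, -I2]])


def hamiltonian(p, m=1.0, c=1.0):
    """Matrix c*alpha.p + beta*m*c^2 for a fixed momentum 3-vector p."""
    H = BETA * (m * c**2)
    for p_i, alpha_i in zip(p, ALPHA):
        H = H + c * p_i * alpha_i
    return H


def evolve(u0, p, t, m=1.0, c=1.0, hbar=1.0):
    """Apply exp(-i H t / hbar) to the spinor amplitude u0."""
    E, V = np.linalg.eigh(hamiltonian(p, m, c))
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ u0))


def density_and_current(u, c=1.0):
    """Return rho = u^dag u and j_i = c u^dag alpha_i u, as in (1140)-(1141)."""
    rho = np.vdot(u, u).real
    j = np.array([c * np.vdot(u, a @ u).real for a in ALPHA])
    return rho, j


rng = np.random.default_rng(0)
u0 = rng.normal(size=4) + 1j * rng.normal(size=4)
u0 /= np.linalg.norm(u0)              # normalize so that rho = 1 at t = 0
p = np.array([0.5, 0.1, -0.7])        # arbitrary illustrative momentum

for t in (0.0, 1.0, 5.0):
    rho, j = density_and_current(evolve(u0, p, t))
    print(t, rho, np.linalg.norm(j))
```

Because the matrix is Hermitian, the evolution is unitary and `rho` stays equal to its initial value to within rounding error. The printed magnitude of the current never exceeds $ c\,\rho$ , since each $ \alpha_i$ has eigenvalues $ \pm 1$ .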