
9.6: Dirac’s Theory


    The real breakthrough toward the quantum relativistic theory of electrons (and other spin-\(1 / 2\) fermions) was achieved in 1928 by P. A. M. Dirac. For that time, the structure of his theory was highly nontrivial. Namely, while formally preserving, in the coordinate representation, the same Schrödinger-picture equation of quantum dynamics as in non-relativistic quantum mechanics, \({ }^{51}\) \[i \hbar \frac{\partial \Psi}{\partial t}=\hat{H} \Psi,\] it postulates that the wavefunction \(\Psi\) it describes is not a scalar complex function of time and coordinates, but a four-component column-vector (sometimes called the bispinor) of such functions, its Hermitian-conjugate bispinor \(\Psi^{\dagger}\) being a four-component row-vector of their complex conjugates: \[\Psi=\left(\begin{array}{c} \Psi_{1}(\mathbf{r}, t) \\ \Psi_{2}(\mathbf{r}, t) \\ \Psi_{3}(\mathbf{r}, t) \\ \Psi_{4}(\mathbf{r}, t) \end{array}\right), \quad \Psi^{\dagger}=\left(\Psi_{1}^{*}(\mathbf{r}, t), \quad \Psi_{2}^{*}(\mathbf{r}, t), \quad \Psi_{3}^{*}(\mathbf{r}, t), \quad \Psi_{4}^{*}(\mathbf{r}, t)\right)\] and that the Hamiltonian participating in Eq. (95) is a \(4 \times 4\) matrix defined in the Hilbert space of bispinors \(\Psi\). For a free particle, the postulated Hamiltonian looks amazingly simple: \({ }^{52}\)

    \[\hat{H}=c \hat{\boldsymbol{\alpha}} \cdot \hat{\mathbf{p}}+\hat{\beta} m c^{2},\] where \(\hat{\mathbf{p}}=-i \hbar \nabla\) is the same \(3 \mathrm{D}\) vector operator of momentum as in the non-relativistic case, while the operators \(\hat{\boldsymbol{\alpha}}\) and \(\hat{\beta}\) may be represented in the following shorthand \(2 \times 2\) form: \[\hat{\boldsymbol{\alpha}} \equiv\left(\begin{array}{cc} \hat{0} & \hat{\boldsymbol{\sigma}} \\ \hat{\boldsymbol{\sigma}} & \hat{0} \end{array}\right), \quad \hat{\beta} \equiv\left(\begin{array}{cc} \hat{I} & \hat{0} \\ \hat{0} & -\hat{I} \end{array}\right) .\] The operator \(\hat{\boldsymbol{\alpha}}\), composed of the Pauli vector operators \(\hat{\boldsymbol{\sigma}}\), is also a vector in the usual \(3 \mathrm{D}\) space, with each of its 3 Cartesian components being a \(4 \times 4\) matrix. The particular form of the \(2 \times 2\) matrices corresponding to the operators \(\hat{\boldsymbol{\sigma}}\) and \(\hat{I}\) in Eq. (98a) depends on the basis selected for the spin state representation; for example, in the standard \(z\)-basis, in which the Cartesian components of \(\hat{\boldsymbol{\sigma}}\) are represented by the Pauli matrices (4.105), the \(4 \times 4\) matrix form of Eq. (98a) is \[\alpha_{x}=\left(\begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{array}\right), \quad \alpha_{y}=\left(\begin{array}{cccc} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{array}\right), \quad \alpha_{z}=\left(\begin{array}{cccc} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{array}\right), \quad \beta=\left(\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{array}\right) . \tag{9.98b}\] It is straightforward to use Eqs. (98) to verify that the matrices \(\alpha_{x}, \alpha_{y}, \alpha_{z}\) and \(\beta\) satisfy the following relations: \[\begin{gathered} \alpha_{x}^{2}=\alpha_{y}^{2}=\alpha_{z}^{2}=\beta^{2}=\mathrm{I}, \\ \alpha_{x} \alpha_{y}+\alpha_{y} \alpha_{x}=\alpha_{y} \alpha_{z}+\alpha_{z} \alpha_{y}=\alpha_{z} \alpha_{x}+\alpha_{x} \alpha_{z}=\alpha_{x} \beta+\beta \alpha_{x}=\alpha_{y} \beta+\beta \alpha_{y}=\alpha_{z} \beta+\beta \alpha_{z}=0, \end{gathered}\] i.e. each of these matrices squares to the identity matrix, and any two different matrices of this set anticommute.
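
The relations above can also be checked mechanically. The following is a minimal sketch in plain Python (no external libraries), assuming the Pauli matrices of Eq. (4.105) have their standard \(z\)-basis form; the helper names `block`, `matmul`, `add`, and `anticommutator` are hypothetical and introduced only for this illustration.

```python
# Pauli matrices in the standard z-basis (assumed to match Eq. (4.105)).
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
I2 = [[1, 0], [0, 1]]
NEG_I2 = [[-1, 0], [0, -1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    top = [a[i] + b[i] for i in range(2)]
    bottom = [c[i] + d[i] for i in range(2)]
    return top + bottom


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def anticommutator(a, b):
    """Return ab + ba."""
    return add(matmul(a, b), matmul(b, a))


# Eq. (98a): sigma on the off-diagonal blocks of alpha, beta = diag(I, -I).
ALPHA = {k: block(Z2, s, s, Z2) for k, s in SIGMA.items()}
BETA = block(I2, Z2, Z2, NEG_I2)

IDENTITY4 = block(I2, Z2, Z2, I2)
ZERO4 = block(Z2, Z2, Z2, Z2)

matrices = list(ALPHA.values()) + [BETA]

# Each matrix squares to the identity ...
squares_ok = all(matmul(m, m) == IDENTITY4 for m in matrices)
# ... and every pair of different matrices anticommutes.
anticommute_ok = all(
    anticommutator(matrices[i], matrices[j]) == ZERO4
    for i in range(4) for j in range(i + 1, 4)
)
print(squares_ok, anticommute_ok)  # prints: True True
```

Building the \(4 \times 4\) matrices from \(2 \times 2\) blocks, rather than typing them in, also confirms that the explicit form (98b) follows from the shorthand form (98a).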

    Using these relations, and acting essentially as in Sec. 1.4, it is straightforward to show that any solution to the Dirac equation obeys the probability conservation law, i.e. the continuity equation (1.52), with the probability density \[w=\Psi^{\dagger} \Psi\] and the probability current \[\mathbf{j}=\Psi^{\dagger} c \hat{\boldsymbol{\alpha}} \Psi,\] which look almost the same as in non-relativistic wave mechanics (cf. Eqs. (1.22) and (1.47)). Note, however, that these formulas use the Hermitian conjugation, instead of the complex conjugation, to form the scalars \(w\), \(j_{x}, j_{y}\), and \(j_{z}\) from the 4-component state vectors (96).
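
    One possible route to this result is sketched here; it is not necessarily the route of Sec. 1.4, and it assumes only that the matrices \(\hat{\boldsymbol{\alpha}}\) and \(\hat{\beta}\) are Hermitian and coordinate-independent, as the matrices (98b) are. Writing out Eqs. (95) and (97) together with their Hermitian conjugate gives \[\frac{\partial \Psi}{\partial t}=-c \hat{\boldsymbol{\alpha}} \cdot \nabla \Psi-\frac{i m c^{2}}{\hbar} \hat{\beta} \Psi, \quad \frac{\partial \Psi^{\dagger}}{\partial t}=-c\left(\nabla \Psi^{\dagger}\right) \cdot \hat{\boldsymbol{\alpha}}+\frac{i m c^{2}}{\hbar} \Psi^{\dagger} \hat{\beta},\] so that the mass terms cancel in the time derivative of \(w\): \[\frac{\partial w}{\partial t}=\Psi^{\dagger} \frac{\partial \Psi}{\partial t}+\frac{\partial \Psi^{\dagger}}{\partial t} \Psi=-c\left[\Psi^{\dagger} \hat{\boldsymbol{\alpha}} \cdot \nabla \Psi+\left(\nabla \Psi^{\dagger}\right) \cdot \hat{\boldsymbol{\alpha}} \Psi\right]=-\nabla \cdot\left(\Psi^{\dagger} c \hat{\boldsymbol{\alpha}} \Psi\right) \equiv-\nabla \cdot \mathbf{j},\] which is the continuity equation for the density \(w\) and the current \(\mathbf{j}\) given above.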

    This close similarity extends to the fundamental, plane-wave solutions of the Dirac equation in free space. Indeed, plugging such a solution, in the form \[\Psi=\mathrm{u} e^{i(\mathbf{k} \cdot \mathbf{r}-\omega t)} \equiv\left(\begin{array}{l} u_{1} \\ u_{2} \\ u_{3} \\ u_{4} \end{array}\right) e^{i(\mathbf{k} \cdot \mathbf{r}-\omega t)}\] into Eqs. (95) and (97), we see that they are indeed satisfied, provided that a system of four coupled, linear algebraic equations for four complex \(c\)-number amplitudes \(u_{1,2,3,4}\) is satisfied. The condition of its consistency yields the same dispersion relation (87), i.e. the same two-branch diagram shown in Fig. 6, as follows from the Klein-Gordon equation. The difference is that plugging each value of \(\omega\), given by Eq. (87), back into the system of linear equations for the four amplitudes \(u\), we get two solutions for their vector \(\mathbf{u} \equiv\left(u_{1}, u_{2}, u_{3}, u_{4}\right)\) for each of the two energy branches (see Fig. 6 again). In the standard \(z\)-basis of spin operators, they may be represented as follows:

    \[\ \text { for } E=E_{+}>0: \quad \mathrm{u}_{+\uparrow}=c_{+\uparrow}\left(\begin{array}{c}
    1 \\ 0\\
    \frac{c p_{z}}{E_{+}+m c^{2}} \\
    \frac{c p_{+}}{E_{+}+m c^{2}}
    \end{array}\right), \quad \mathrm{u}_{+\downarrow}=c_{+\downarrow}\left(\begin{array}{c}
    0 \\ 1\\
    \frac{c p_{-}}{E_{+}+m c^{2}} \\
    \frac{-c p_{z}}{E_{+}+m c^{2}}
    \end{array}\right),\]

    \[\ \text { for } E=E_{-}<0: \quad \mathrm{u}_{-\uparrow}=c_{-\uparrow}\left(\begin{array}{c}
    \frac{c p_{z}}{E_{-}-m c^{2}} \\
    \frac{c p_{+}}{E_{-}-m c^{2}} \\
    1 \\
    0
    \end{array}\right), \quad \mathrm{u}_{-\downarrow}=c_{-\downarrow}\left(\begin{array}{c}
    \frac{c p_{-}}{E_{-}-m c^{2}} \\
    \frac{-c p_{z}}{E_{-}-m c^{2}} \\
    0 \\
    1
    \end{array}\right),\]

    where \(p_{\pm} \equiv p_{x} \pm i p_{y}\), and \(c_{\pm \uparrow}\) and \(c_{\pm \downarrow}\) are normalization coefficients.
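
As a sanity check, the four amplitude vectors above can be substituted numerically into the eigenvalue problem \(\mathrm{H} \mathrm{u}=E \mathrm{u}\), where \(\mathrm{H}=c \boldsymbol{\alpha} \cdot \mathbf{p}+\beta m c^{2}\) is the \(4 \times 4\) matrix obtained from Eqs. (97) and (98b) for a fixed momentum \(\mathbf{p}=\hbar \mathbf{k}\). The sketch below rests on several assumptions that are not taken from the text: units with \(c=m=1\), an arbitrary test momentum, all normalization coefficients set to 1, and the dispersion relation (87) taken in its usual form \(E_{\pm}=\pm\left[\left(m c^{2}\right)^{2}+(c p)^{2}\right]^{1 / 2}\). The names `spinors` and `residual` are hypothetical.

```python
import math

# Natural units, the test momentum, and unit normalization coefficients
# are assumptions of this sketch, not values from the text.
c, m = 1.0, 1.0
px, py, pz = 0.3, -0.4, 1.2
mc2 = m * c**2

# Assumed form of the dispersion relation (87).
E_plus = math.sqrt(mc2**2 + c**2 * (px**2 + py**2 + pz**2))
E_minus = -E_plus

p_plus, p_minus = complex(px, py), complex(px, -py)  # p_x +/- i p_y

# H = c * alpha . p + beta * m c^2, written out with the matrices (98b).
H = [
    [mc2,        0,           c * pz,      c * p_minus],
    [0,          mc2,         c * p_plus, -c * pz],
    [c * pz,     c * p_minus, -mc2,        0],
    [c * p_plus, -c * pz,     0,          -mc2],
]


def spinors():
    """Return {label: (energy, amplitude vector)} for the four solutions."""
    dp = E_plus + mc2    # denominator on the positive-energy branch
    dm = E_minus - mc2   # denominator on the negative-energy branch
    return {
        "+up":   (E_plus,  [1, 0, c * pz / dp, c * p_plus / dp]),
        "+down": (E_plus,  [0, 1, c * p_minus / dp, -c * pz / dp]),
        "-up":   (E_minus, [c * pz / dm, c * p_plus / dm, 1, 0]),
        "-down": (E_minus, [c * p_minus / dm, -c * pz / dm, 0, 1]),
    }


def residual(E, u):
    """Largest |(H u - E u)_i|; zero up to rounding for an eigenvector."""
    return max(abs(sum(H[i][k] * u[k] for k in range(4)) - E * u[i])
               for i in range(4))


residuals = {name: residual(E, u) for name, (E, u) in spinors().items()}
for name, r in residuals.items():
    print(f"{name:>6}: max |Hu - Eu| = {r:.2e}")
```

All four residuals should vanish to within floating-point rounding, for any choice of the test momentum.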

    The simplest interpretation of these solutions is that Eq. (103), with the vectors \(\mathbf{u}_{+}\) given by Eq. (104a), represents a spin-1/2 particle (say, an electron), while with the vectors \(\mathbf{u}_{-}\) given by Eq. (104b), it represents an antiparticle (a positron), and the two solutions for each particle, indexed with opposite arrows, correspond to the two possible directions of the spin, \(\sigma_{z}=\pm 1\), i.e. \(S_{z}=\pm \hbar / 2\). This interpretation is indeed solid in the non-relativistic limit, when the last two components of the vectors (104a), and the first two components of the vectors (104b), are negligibly small: \[\mathrm{u}_{+\uparrow} \rightarrow\left(\begin{array}{c} 1 \\ 0 \\ 0 \\ 0 \end{array}\right), \quad \mathrm{u}_{+\downarrow} \rightarrow\left(\begin{array}{c} 0 \\ 1 \\ 0 \\ 0 \end{array}\right), \quad \mathrm{u}_{-\uparrow} \rightarrow\left(\begin{array}{l} 0 \\ 0 \\ 1 \\ 0 \end{array}\right), \quad \mathrm{u}_{-\downarrow} \rightarrow\left(\begin{array}{l} 0 \\ 0 \\ 0 \\ 1 \end{array}\right), \quad \text { for } \frac{p_{x, y, z}}{m c} \rightarrow 0\] However, at arbitrary energies, the physical picture is more complex. To show this, let us use the Dirac equation to calculate the Heisenberg-picture law of time evolution of the operator of some Cartesian component of the orbital angular momentum \(\mathbf{L} \equiv \mathbf{r} \times \mathbf{p}\), for example of \(L_{x}=y p_{z}-z p_{y}\), taking into account that the Dirac operators (98a) commute with those of \(\mathbf{r}\) and \(\mathbf{p}\) (so that the term \(\hat{\beta} m c^{2}\) drops out of the commutator), and also the Heisenberg commutation relations (2.14): \[i \hbar \frac{\partial \hat{L}_{x}}{\partial t}=\left[\hat{L}_{x}, \hat{H}\right]=c \hat{\boldsymbol{\alpha}} \cdot\left[\left(\hat{y} \hat{p}_{z}-\hat{z} \hat{p}_{y}\right), \hat{\mathbf{p}}\right]=-i \hbar c\left(\hat{\alpha}_{z} \hat{p}_{y}-\hat{\alpha}_{y} \hat{p}_{z}\right)\] with similar relations for the two other Cartesian components. 
Since the right-hand side of these equations is different from zero, the orbital angular momentum is generally not conserved, even for a free particle! Let us, however, consider the following vector operator, \[\hat{\mathbf{S}} \equiv \frac{\hbar}{2}\left(\begin{array}{ll} \hat{\boldsymbol{\sigma}} & \hat{0} \\ \hat{0} & \hat{\boldsymbol{\sigma}} \end{array}\right) .\] According to Eqs. (4.105), its Cartesian components, in the \(z\)-basis, are represented by \(4 \times 4\) matrices \[\mathrm{S}_{x}=\frac{\hbar}{2}\left(\begin{array}{cccc} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{array}\right), \quad \mathrm{S}_{y}=\frac{\hbar}{2}\left(\begin{array}{cccc} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{array}\right), \quad \mathrm{S}_{z}=\frac{\hbar}{2}\left(\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{array}\right) \text {. }\] Let us calculate the Heisenberg-picture law of time evolution of these components, for example \[i \hbar \frac{\partial \hat{S}_{x}}{\partial t}=\left[\hat{S}_{x}, \hat{H}\right]=c\left[\hat{S}_{x},\left(\hat{\alpha}_{x} \hat{p}_{x}+\hat{\alpha}_{y} \hat{p}_{y}+\hat{\alpha}_{z} \hat{p}_{z}\right)\right]\] (the term \(\hat{\beta} m c^{2}\) does not contribute, because the block-diagonal matrices (107) commute with \(\hat{\beta}\)). A direct calculation of the commutators of the matrices (98) and (107) yields \[\left[\hat{S}_{x}, \hat{\alpha}_{x}\right]=0,\left[\hat{S}_{x}, \hat{\alpha}_{y}\right]=i \hbar \hat{\alpha}_{z},\left[\hat{S}_{x}, \hat{\alpha}_{z}\right]=-i \hbar \hat{\alpha}_{y}\] so that we finally get \[i \hbar \frac{\partial \hat{S}_{x}}{\partial t}=i \hbar c\left(\hat{\alpha}_{z} \hat{p}_{y}-\hat{\alpha}_{y} \hat{p}_{z}\right),\] with similar expressions for the other two components of the operator. Comparing this result with Eq. (106), we see that any Cartesian component of the operator defined similarly to Eq. 
(5.170), \[\hat{\mathbf{J}} \equiv \hat{\mathbf{L}}+\hat{\mathbf{S}}\] is an integral of motion, \({ }^{53}\) so that this operator may be interpreted as the one representing the total angular momentum of the particle. Hence, the operator (107) may be interpreted as the spin operator of a spin-\(1 / 2\) particle (e.g., an electron). As follows from the last of Eqs. (107b), in the non-relativistic limit the columns (105) represent the eigenkets of the \(z\)-component of that operator, with eigenvalues \(S_{z}=\pm \hbar / 2\), the sign corresponding to the arrow index. So, the Dirac theory provides a justification for spin-\(1 / 2\) or, somewhat more humbly, replaces the Pauli Hamiltonian postulate (4.163) with that of a simpler (and hence more plausible), Lorentz-invariant Hamiltonian (97).
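
The matrix commutators quoted above, and the eigenvalues of \(\mathrm{S}_{z}\) on the columns (105), can be confirmed with a short plain-Python sketch. It assumes units with \(\hbar=1\) and the standard \(z\)-basis Pauli matrices; the helpers `block`, `scale`, `matmul`, and `commutator` are hypothetical names used only here.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

# Pauli matrices in the standard z-basis (assumed to match Eq. (4.105)).
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def scale(k, a):
    """Multiply every element of matrix a by the number k."""
    return [[k * x for x in row] for row in a]


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


# alpha: sigma on the off-diagonal blocks; S: (hbar/2) sigma on the diagonal.
alpha = {k: block(Z2, s, s, Z2) for k, s in SIGMA.items()}
beta = block(I2, Z2, Z2, scale(-1, I2))
S = {k: scale(HBAR / 2, block(s, Z2, Z2, s)) for k, s in SIGMA.items()}
ZERO4 = block(Z2, Z2, Z2, Z2)

checks = {
    "[Sx, ax] = 0":
        commutator(S["x"], alpha["x"]) == ZERO4,
    "[Sx, ay] = +i hbar az":
        commutator(S["x"], alpha["y"]) == scale(1j * HBAR, alpha["z"]),
    "[Sx, az] = -i hbar ay":
        commutator(S["x"], alpha["z"]) == scale(-1j * HBAR, alpha["y"]),
    "[Sx, beta] = 0":
        commutator(S["x"], beta) == ZERO4,
}
for label, ok in checks.items():
    print(label, ok)

# S_z is diagonal in this basis, so the unit columns (105) are its
# eigenvectors, with eigenvalues given by the diagonal elements.
sz_diagonal = [S["z"][i][i] for i in range(4)]
print(sz_diagonal)
```

The last commutator in the list is the reason the mass term of the Hamiltonian (97) does not affect the time evolution of the spin operator.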

    Note, however, that this simple interpretation, fully separating a particle from its antiparticle, is not valid for the exact solutions (103)-(104), so that generally the eigenstates of the Dirac Hamiltonian are certain linear (coherent) superpositions of the components describing the particle and its antiparticle, each with both directions of spin. This fact leads to several interesting effects, including the so-called Klein paradox at the reflection of a relativistic electron from a potential barrier. \({ }^{54}\)


    \({ }^{51}\) After the "naturally-relativistic" form of the Klein-Gordon equation (84), this apparent return to the non-relativistic Schrödinger equation may look very counter-intuitive. However, it becomes a bit less surprising if we take into account the fact (whose proof is left as an exercise for the reader) that Eq. (84) may also be recast into the form (95) for a two-component column-vector \(\Psi\) (sometimes called the spinor), with a Hamiltonian that may be represented by a \(2 \times 2\) matrix, and hence expressed via the Pauli matrices (4.105) and the identity matrix I.

    \({ }^{52}\) Moreover, if the time derivative participating in Eq. (95), and the three coordinate derivatives participating (via the momentum operator) in Eq. (97), are merged into one 4-vector operator \(\partial / \partial x_{k} \equiv\{\nabla, \partial / \partial(c t)\}\), the Dirac equation (95) may be rewritten in an even simpler, manifestly Lorentz-invariant 4-vector form (with the implied summation over the repeated index \(k=1, \ldots, 4\); see, e.g., EM Sec. 9.4): \[\left(\hat{\gamma}_{k} \frac{\partial}{\partial x_{k}}+\mu\right) \Psi=0, \quad \text { where } \hat{\gamma} \equiv\left\{\hat{\gamma}_{1}, \hat{\gamma}_{2}, \hat{\gamma}_{3}\right\}=\left(\begin{array}{cc} 0 & -i \hat{\boldsymbol{\sigma}} \\ i \hat{\boldsymbol{\sigma}} & 0 \end{array}\right), \quad \hat{\gamma}_{4}=\hat{\beta}\] where \(\mu \equiv m c / \hbar\), just as in Eq. (84). Note also that, very counter-intuitively, the Dirac Hamiltonian (97) is linear in the momentum, while the non-relativistic Hamiltonian of a particle, as well as the relativistic Schrödinger equation, are quadratic in \(\mathbf{p}\). In my humble opinion, the Dirac theory (including the concept of antiparticles it has inspired) may compete for the title of the most revolutionary theoretical idea in physics of all time, despite such strong contenders as Newton’s laws, Maxwell’s equations, Gibbs’ statistical distribution, Bohr’s theory of the hydrogen atom, and Einstein’s general relativity.

    \({ }^{53}\) It is straightforward to show that this result remains valid for a particle in any central field \(U(r)\).

    \({ }^{54}\) See, e.g., A. Calogeracos and N. Dombey, Contemp. Phys. 40, 313 (1999).


    This page titled 9.6: Dirac’s Theory is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Konstantin K. Likharev via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.