# 4.1: Angular Momentum Operator Algebra


# Preliminaries: Translation and Rotation Operators

As a warm-up to analyzing how a wave function transforms under rotation, we review the effect of *linear translation* on a single particle wave function \(\psi(x)\). We have already seen an example of this: the coherent states of a simple harmonic oscillator discussed earlier were (at \(t=0\)) identical to the ground state except that they were centered at some point displaced from the origin. In fact, the operator creating such a state from the ground state is a translation operator.

The *translation operator* \(T(a)\) is *defined* as that operator which, when it acts on a wave function ket \(|\psi(x)\rangle\), gives the ket corresponding to that wave function moved over by \(a\), that is,

\[ T(a)|\psi(x)\rangle =|\psi(x-a)\rangle , \label{4.1.1}\]

so, for example, if \(\psi(x)\) is a wave function centered at the origin, \(T(a)\) moves it to be centered at the point \(a\).

We have written the wave function as a ket here to emphasize the parallels between this operation and some later ones, but it is simpler at this point to just work with the wave function as a function, so we will drop the ket bracket for now. The form of \(T(a)\) as an operator on a function is made evident by rewriting the Taylor series in operator form:

\[ \psi(x-a)=\psi(x)-a\dfrac{d}{dx}\psi(x)+\dfrac{a^2}{2!}\dfrac{d^2}{dx^2}\psi(x)-\dots =e^{-a\dfrac{d}{dx}}\psi(x)=T(a)\psi(x). \label{4.1.2}\]
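The operator form of the Taylor series can be checked directly on a polynomial, for which the series terminates after finitely many terms. The following is a minimal sketch, not part of the original text: the coefficient-list representation, the function names and the test polynomial are all illustrative assumptions.

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...] (c_k multiplies x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at the point x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def translate(coeffs, a):
    """Apply exp(-a d/dx) term by term; the sum ends once the derivative vanishes."""
    result = [0.0] * len(coeffs)
    term = list(coeffs)              # n-th derivative of psi, starting with n = 0
    n = 0
    while term:
        factor = (-a) ** n / factorial(n)
        for k, c in enumerate(term):
            result[k] += factor * c
        term = derivative(term)
        n += 1
    return result

psi = [1.0, -2.0, 0.5, 3.0]          # psi(x) = 1 - 2x + 0.5x^2 + 3x^3, an arbitrary choice
a = 0.7
shifted = translate(psi, a)          # coefficients of T(a) psi
x = 1.3
print(evaluate(shifted, x), evaluate(psi, x - a))   # the two numbers should agree
```

The two printed values agree up to rounding, which is the statement \(T(a)\psi(x)=\psi(x-a)\) for this particular polynomial.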

Now for the quantum connection: the differential operator appearing in the exponential is, in quantum mechanics, proportional to the momentum operator (\(\hat{p}=-i\hbar d/dx\)), so the translation operator is

\[ T(a)=e^{-ia\hat{p}/\hbar} . \label{4.1.3}\]

An important special case is that of an infinitesimal translation, for which we keep only terms up to first order in \(\varepsilon\),

\[ T(\varepsilon)=e^{-i\varepsilon\hat{p}/\hbar} =1-i\varepsilon\hat{p}/\hbar . \label{4.1.4}\]

The momentum operator \(\hat{p}\) is said to be the *generator* of the translation.
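A quick numerical illustration of \(T(a)=e^{-ia\hat{p}/\hbar}\) uses a momentum eigenfunction, on which \(\hat{p}\) acts as the number \(\hbar k\). This is only a sketch: the plane-wave test function, the choice of units with \(\hbar=1\), and the function names are assumptions made for the example.

```python
import cmath

HBAR = 1.0  # units with hbar = 1, chosen for convenience

def plane_wave(k, x):
    """Momentum eigenfunction exp(i k x), with eigenvalue p = hbar * k."""
    return cmath.exp(1j * k * x)

def translate_plane_wave(k, a, x):
    """Act with exp(-i a p / hbar) on the plane wave; p is just a number here."""
    p = HBAR * k
    return cmath.exp(-1j * a * p / HBAR) * plane_wave(k, x)

k, a, x = 2.5, 0.4, 1.1
lhs = translate_plane_wave(k, a, x)   # T(a) psi evaluated at x
rhs = plane_wave(k, x - a)            # psi evaluated at x - a
print(abs(lhs - rhs))                 # should be zero up to rounding
```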

*A note on possibly confusing notation*:

Shankar writes (page 281) \(T(\varepsilon)|x\rangle =|x+\varepsilon\rangle\). Here \(|x\rangle\) denotes a delta-function type wave function centered at \(x\). It might be better if he had written \(T(\varepsilon)|x_0\rangle =|x_0+\varepsilon\rangle\); then we would see right away that this translates into the wave function transformation \(T(\varepsilon)\delta (x-x_0)=\delta (x-x_0-\varepsilon )\), the sign of \(\varepsilon\) now obviously consistent with our usage above.

It is important to be clear about whether the *system* is being translated by \(a\), as we have done above, or whether, alternatively, the *coordinate axes* are being translated by \(a\); the latter would result in the *opposite* change in the wave function. Translating the coordinate axes, along with the apparatus and any external fields, by \(-a\) relative to the wave function would of course give the same physics as translating the wave function by \(+a\). In fact, these two equivalent operations are analogous to the time development of a wave function being described either in the Schrödinger picture, in which the bras and kets change in time but not the operators, or in the Heisenberg picture, in which the operators develop but the bras and kets do not change. To pursue this analogy a little further, in the “Heisenberg” case

\[ \hat{x}\to T^{-1}(\varepsilon)\hat{x}T(\varepsilon)=e^{i\varepsilon\hat{p}/\hbar} \hat{x}e^{-i\varepsilon\hat{p}/\hbar} =\hat{x}+i\varepsilon [\hat{p},\hat{x}]/\hbar =\hat{x}+\varepsilon \label{4.1.5}\]

and \(\hat{p}\) is unchanged since it commutes with the translation operator. So there are two possible ways to deal with translations: transform the bras and kets, *or* transform the operators. We shall almost always leave the operators alone, and transform the bras and kets.
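For readers who want the intermediate step in the “Heisenberg” transformation of \(\hat{x}\), the expansion to first order in \(\varepsilon\) can be written out as follows. It uses only the standard commutator \([\hat{x},\hat{p}]=i\hbar\), that is, \(\hat{p}\hat{x}-\hat{x}\hat{p}=-i\hbar\):

\[ \left(1+\dfrac{i\varepsilon}{\hbar}\hat{p}\right)\hat{x}\left(1-\dfrac{i\varepsilon}{\hbar}\hat{p}\right)=\hat{x}+\dfrac{i\varepsilon}{\hbar}(\hat{p}\hat{x}-\hat{x}\hat{p})+O(\varepsilon^2)=\hat{x}+\dfrac{i\varepsilon}{\hbar}(-i\hbar)+O(\varepsilon^2)=\hat{x}+\varepsilon+O(\varepsilon^2). \]

For the full exponentials the higher-order terms cancel, because \([\hat{p},\hat{x}]\) is a number, so the result \(\hat{x}\to\hat{x}+\varepsilon\) is exact.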

We have established that *the momentum operator is the generator of spatial translations* (the generalization to three dimensions is trivial). We know from earlier work that the Hamiltonian is the generator of *time* translations, by which we mean

\[ \psi(t+a)=e^{-iHa/\hbar} \psi(t). \label{4.1.6}\]

It is tempting to conclude that the *angular momentum* must be the operator generating *rotations* of the system, and, in fact, it is easy to check that this is correct. Let us consider an infinitesimal rotation \(\delta\vec{\theta}\) about some axis through the origin (the infinitesimal vector being in the direction of the axis). A wave function \(\psi(\vec{r})\) initially localized at \(\vec{r}_0\) will shift to be localized at \(\vec{r}_0+\delta\vec{r}_0\), where \(\delta\vec{r}_0=\delta\vec{\theta}\times \vec{r}_0\). So, how does a wave function transform under this small rotation? Just as for the translation case, \(\psi(\vec{r})\to \psi(\vec{r}-\delta\vec{r})\), where \(\delta\vec{r}=\delta\vec{\theta}\times \vec{r}\). If you don’t understand the minus sign, reread the discussion on translations and the sign of \(\varepsilon\).

Thus

\[ \psi(\vec{r})\to \psi(\vec{r})-\dfrac{i}{\hbar}\delta\vec{r}.\hat{\vec{p}}\psi(\vec{r}) \label{4.1.7}\]

to first order in the infinitesimal quantity, so the rotation operator

\[ R(\delta\vec{\theta})\psi(\vec{r})=(1-\dfrac{i}{\hbar}\delta\vec{\theta}\times \vec{r}.\hat{\vec{p}})\psi(\vec{r})=(1-\dfrac{i}{\hbar}\delta\vec{\theta}.\vec{r}\times \hat{\vec{p}})\psi(\vec{r})=(1-\dfrac{i}{\hbar}\delta\vec{\theta}.\hat{\vec{L}})\psi(\vec{r}). \label{4.1.8}\]

If we write this as

\[ R(\delta\vec{\theta})\psi(\vec{r})=e^{-\dfrac{i}{\hbar}\delta\vec{\theta}.\hat{\vec{L}}}\psi(\vec{r}) \label{4.1.9}\]

it is clear that a finite rotation about a fixed axis is given by multiplying together a large number of these operators, which just amounts to replacing \(\delta\vec{\theta}\) by \(\vec{\theta}\) in the exponential. Another way of going from the infinitesimal rotation to a full rotation is to use the identity

\[ \lim_{N\to\infty}(1+\dfrac{A\theta}{N})^N=e^{A\theta} \label{4.1.10}\]

which is clearly valid even if \(A\) is an operator.
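The limit can be watched converging numerically when \(A\) is a matrix. The sketch below is an illustration under stated assumptions: \(A\) is taken to be the \(2\times 2\) generator of rotations in a plane, so that \(e^{A\theta}\) is the ordinary rotation matrix, and the values of \(\theta\) and \(N\) are arbitrary choices.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matpow(m, power):
    """Raise a square matrix to a nonnegative integer power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = m
    while power:
        if power & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        power >>= 1
    return result

A = [[0.0, -1.0], [1.0, 0.0]]   # assumed example: generator of rotations in the plane
theta = 0.9
N = 2 ** 20

# One small step (1 + A*theta/N), then N of them multiplied together
step = [[(1.0 if i == j else 0.0) + A[i][j] * theta / N for j in range(2)] for i in range(2)]
approx = matpow(step, N)
exact = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
max_error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_error)                # shrinks roughly like 1/N as N grows
```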

We have therefore established that the orbital angular momentum operator \(\hat{\vec{L}}\) is the generator of spatial rotations, by which we mean that if we rotate our apparatus, and the wave function with it, the appropriately transformed wave function is generated by the action of \(R(\vec{\theta})\) on the original wave function. It is perhaps worth giving an explicit example: suppose we rotate the system, and therefore the wave function, through an infinitesimal angle \(\delta\theta_z\) about the \(z\)-axis. Denote the rotated wave function by \(\psi_{rot}(x,y)\). Then

\[ \psi_{rot}(x,y)=(1-\dfrac{i}{\hbar}(\delta\theta_z)\hat{L_z})\psi(x,y)=(1-\dfrac{i}{\hbar}(\delta\theta_z)(-i\hbar (x\dfrac{d}{dy}-y\dfrac{d}{dx})))\psi(x,y)=(1-(\delta\theta_z)(x\dfrac{d}{dy}-y\dfrac{d}{dx}))\psi(x,y)=\psi(x+(\delta\theta_z)y, y-(\delta\theta_z)x). \label{4.1.11}\]

That is to say, the value of the new wave function at \((x,y)\) is the value of the old wave function at the point which was rotated into \((x,y)\).
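The last equality in the explicit example can be checked numerically: the operator form and the shifted-argument form should differ only at second order in \(\delta\theta_z\). This is a sketch with an assumed test function (a Gaussian centred at \((1,0)\)) and hypothetical function names.

```python
import math

def psi(x, y):
    """Test wave function: a Gaussian centred at (1, 0), an arbitrary choice."""
    return math.exp(-((x - 1.0) ** 2 + y ** 2))

def dpsi_dx(x, y):
    """Analytic partial derivative of psi with respect to x."""
    return -2.0 * (x - 1.0) * psi(x, y)

def dpsi_dy(x, y):
    """Analytic partial derivative of psi with respect to y."""
    return -2.0 * y * psi(x, y)

def rotated_first_order(x, y, dtheta):
    """(1 - dtheta (x d/dy - y d/dx)) psi, the operator form."""
    return psi(x, y) - dtheta * (x * dpsi_dy(x, y) - y * dpsi_dx(x, y))

def rotated_by_argument(x, y, dtheta):
    """psi evaluated at the point that the rotation carries into (x, y)."""
    return psi(x + dtheta * y, y - dtheta * x)

x, y = 0.8, 0.5
for dtheta in (1e-2, 1e-3):
    diff = abs(rotated_first_order(x, y, dtheta) - rotated_by_argument(x, y, dtheta))
    print(dtheta, diff)          # diff should fall by about 100 when dtheta falls by 10
```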

# Quantum Generalization of the Rotation Operator

However, it has long been known that in quantum mechanics, orbital angular momentum is *not* the whole story. Particles like the electron are found experimentally to have an internal angular momentum, called spin. In contrast to the spin of an ordinary macroscopic object like a spinning top, the electron’s spin is *not* just the sum of orbital angular momenta of internal parts, and any attempt to understand it in that way leads to contradictions.

To take account of this new kind of angular momentum, we generalize the orbital angular momentum \(\hat{\vec{L}}\) to an operator \(\hat{\vec{J}}\) which is *defined* as the generator of rotations on *any* wave function, including possible spin components, so

\[ R(\delta \vec{\theta})\psi(\vec{r})=e^{-\frac{i}{\hbar} \delta \vec{\theta}.\hat{\vec{J}}}\psi(\vec{r}). \label{4.1.12}\]

This is of course identical to the equation we found for \(\hat{\vec{L}}\), but there we derived it from the quantum angular momentum operator including the momentum components written as differentials. But up to this point \(\psi(\vec{r})\) has just been a complex valued function of position. From now on, the wave function at a point can have *several components*, so it is in some vector space, and the rotation operator will operate in this space as well as being a differential operator with respect to position. For example, the wave function could be a vector at each point, so rotation of the system could rotate this vector as well as moving it to a different \(\vec{r}\).

To summarize: \(\psi(\vec{r})\) is in general an \(n\)-component function at each point in space, \(R(\delta \vec{\theta})\) is an \(n\times n\) matrix in the component space, and the above equation is the *definition* of \(\hat{\vec{J}}\). Starting from this definition, we will find \(\hat{\vec{J}}\)’s properties.

The first point to make is that, in contrast to translations, rotations do not commute even for a classical system. Rotating a book through \(\pi/2\) first about the \(z\)-axis then about the \(x\)-axis leaves it in a different orientation from that obtained by rotating from the same starting position first \(\pi/2\) about the \(x\)-axis then \(\pi/2\) about the \(z\)-axis. Even small rotations do not commute, although the commutator is second order in the small angles. Since the \(R\)-operators are representations of rotations, they will reflect this noncommutativity, and we can see just how they do that by considering ordinary classical rotations of a real vector in three-dimensional space.

The matrices rotating a vector by \(\theta\) about the \(x\), \(y\) and \(z\) axes are

\[ R_x(\theta )=\begin{pmatrix} 1&0&0 \\ 0&\cos\theta& -\sin\theta \\ 0& \sin\theta& \cos\theta \end{pmatrix}, R_y(\theta )=\begin{pmatrix} \cos\theta& 0& \sin\theta \\ 0&1&0 \\ -\sin\theta& 0& \cos\theta \end{pmatrix}, R_z(\theta )=\begin{pmatrix} \cos\theta& -\sin\theta& 0 \\ \sin\theta& \cos\theta& 0 \\ 0&0&1 \end{pmatrix}. \label{4.1.13}\]

In the limit of rotations about infinitesimal angles (ignoring higher order terms),

\[ R_x(\varepsilon )=1+\varepsilon \begin{pmatrix} 0&0&0 \\ 0&0&-1 \\ 0&1&0 \end{pmatrix}, R_y(\varepsilon )=1+\varepsilon \begin{pmatrix} 0&0&1 \\ 0&0&0 \\ -1&0&0 \end{pmatrix}, R_z(\varepsilon )=1+\varepsilon \begin{pmatrix} 0&-1&0 \\ 1&0&0 \\ 0&0&0 \end{pmatrix}. \label{4.1.14}\]

It is easy to check that

\[ [R_x(\varepsilon ),R_y(\varepsilon )]=\varepsilon^2 \begin{pmatrix} 0&-1&0 \\ 1&0&0 \\ 0&0&0 \end{pmatrix}=R_z(\varepsilon^2)-1. \label{4.1.15}\]
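This commutator is easy to confirm numerically with the full rotation matrices rather than their first-order truncations. In the sketch below (plain Python, with an arbitrarily chosen \(\varepsilon\)), the difference between \([R_x(\varepsilon),R_y(\varepsilon)]\) and \(R_z(\varepsilon^2)-1\) should be of third order in \(\varepsilon\), so it is small compared with the \(\varepsilon^2\) term being tested.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

eps = 1e-3
lhs = commutator(rot_x(eps), rot_y(eps))
rz = rot_z(eps ** 2)
rhs = [[rz[i][j] - (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]
residual = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(residual / eps ** 2)      # of order eps, so the eps^2 terms agree
```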

The rotation operators on quantum mechanical kets must, like all rotations, follow this same pattern, that is, we must have

\[ ((1-\dfrac{i}{\hbar}\varepsilon J_x)(1-\dfrac{i}{\hbar}\varepsilon J_y)-(1-\dfrac{i}{\hbar}\varepsilon J_y)(1-\dfrac{i}{\hbar}\varepsilon J_x)+\dfrac{i}{\hbar}\varepsilon^2J_z)|\psi\rangle =0 \label{4.1.16}\]

where we have used the definition of the infinitesimal rotation operator on kets, \(R(\delta \vec{\theta})\psi(\vec{r})=e^{-\dfrac{i}{\hbar} \delta \vec{\theta}.\hat{\vec{J}}}\psi(\vec{r})\). The zeroth and first-order terms in \(\varepsilon\) all cancel; the second-order term gives \([J_x,J_y]=i\hbar J_z\). The general statement is:

\[ [J_i,J_j]=i\hbar \varepsilon_{ijk}J_k \label{4.1.17}\]

This is one of the most important formulas in quantum mechanics.
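As a concrete check, the classical rotation matrices themselves supply one representation of this algebra: comparing \(R_k(\varepsilon)=1+\varepsilon A_k\) with \(1-\dfrac{i}{\hbar}\varepsilon J_k\) gives \(J_k=i\hbar A_k\). The sketch below verifies all nine commutators for these \(3\times 3\) matrices. The names `A`, `J`, `levi_civita` and `relation_holds`, and the choice of units with \(\hbar=1\), are assumptions made for the example.

```python
HBAR = 1.0  # units with hbar = 1, chosen for convenience

# Generators read off from the infinitesimal rotation matrices R_k(eps) = 1 + eps * A_k
A = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],   # A_x
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],   # A_y
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],   # A_z
]
# Matching 1 + eps*A_k with 1 - (i/hbar)*eps*J_k gives J_k = i*hbar*A_k
J = [[[1j * HBAR * entry for entry in row] for row in a] for a in A]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def relation_holds(i, j):
    """Check [J_i, J_j] = i*hbar * sum_k eps_ijk J_k entry by entry."""
    lhs = commutator(J[i], J[j])
    for r in range(3):
        for c in range(3):
            rhs = sum(1j * HBAR * levi_civita(i, j, k) * J[k][r][c] for k in range(3))
            if abs(lhs[r][c] - rhs) > 1e-12:
                return False
    return True

all_ok = all(relation_holds(i, j) for i in range(3) for j in range(3))
print(all_ok)
```

The same relations must hold in every other representation, including those with half-odd-integer \(j\), which have no classical counterpart.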

# Consequences of the Commutation Relations

The commutation formula \([J_i,J_j]=i\hbar \varepsilon_{ijk}J_k\), which is, after all, a straightforward extension of the result for ordinary classical rotations, has surprisingly far-reaching consequences: it leads directly to the directional quantization of spin and angular momentum observed in atoms subject to a magnetic field.

It is by now very clear that in quantum mechanical systems such as atoms the total angular momentum, and also the component of angular momentum in a given direction, can only take certain values. Let us try to construct a basis set of angular momentum states for a given system: a complete set of kets corresponding to all allowed values of the angular momentum. Now, angular momentum is a *vector* quantity: it has magnitude and direction. Let’s begin with the magnitude, for which the natural parameter is the length squared:

\[ J^2=J^2_x+J^2_y+J^2_z. \label{4.1.18}\]

Now we must specify direction -- but here we run into a problem. \(J_x\), \(J_y\) and \(J_z\) are all mutually non-commuting, so we cannot construct a set of common eigenkets of any two of them, which we would need for a precise specification of direction. They *do* all commute with \(J^2\), since it is spherically symmetric and therefore cannot be affected by any rotation (and it’s easy to check this commutation explicitly).
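The explicit check takes only a few lines. As a sketch, using just \([J_i,J_j]=i\hbar \varepsilon_{ijk}J_k\) and the general identity \([A^2,B]=A[A,B]+[A,B]A\):

\[ [J^2,J_z]=[J_x^2,J_z]+[J_y^2,J_z]=J_x[J_x,J_z]+[J_x,J_z]J_x+J_y[J_y,J_z]+[J_y,J_z]J_y \]

\[ =-i\hbar (J_xJ_y+J_yJ_x)+i\hbar (J_yJ_x+J_xJ_y)=0, \]

where \([J_z^2,J_z]=0\) trivially, and we used \([J_x,J_z]=-i\hbar J_y\) and \([J_y,J_z]=i\hbar J_x\). The same argument with the indices cyclically permuted gives \([J^2,J_x]=[J^2,J_y]=0\).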

The bottom line, then, is that in attempting to construct eigenkets describing the different possible angular momentum states of a quantum system, the best we can do is to find the common eigenkets of \(J^2\) and *one* direction, say \(J_z\). The commutation relations do not allow us to be more precise about direction, analogous to the Uncertainty Principle for position and momentum, which also comes from noncommutativity of the relevant operators.

We conclude that the appropriate angular momentum basis is the set of common eigenkets of the commuting Hermitian operators \(J^2\), \(J_z\):

\[ J^2|a,b\rangle =a|a,b\rangle , \quad J_z|a,b\rangle =b|a,b\rangle . \label{4.1.19}\]

Our next task is to find the allowed values of \(a\) and \(b\).

The Uncertainty Principle limits our knowledge about the direction of the angular momentum: the best we can do is to find the common eigenkets of \(J^2\) and *one* direction, say \(J_z\).

# Ladder Operators

The sets of allowed eigenvalues \(a\), \(b\) can be found using the “ladder operator” trick previously discussed for the simple harmonic oscillator. It turns out that the operators

\[J_{\pm} =J_x\pm iJ_y \label{4.1.20}\]

are closely analogous to the simple harmonic oscillator raising and lowering operators \(a^{\dagger}\) and \(a\).

\(J_+\) and \(J_-\) have commutation relations with \(J_z\):

\[ [J_z,J_{\pm} ]=\pm \hbar J_{\pm} \label{4.1.21}\]

and they of course *commute* with \(J^2\), as do \(J_x\), \(J_y\) and \(J_z\).

Therefore, \(J_{\pm}\) operating on \(|a,b\rangle\) cannot affect the value of \(a\). But they *do* change the value of \(b\):

\[ J_zJ_{\pm} |a,b\rangle =[J_z,J_{\pm} ]|a,b\rangle +J_{\pm} J_z|a,b\rangle =\pm \hbar J_{\pm} |a,b\rangle +bJ_{\pm} |a,b\rangle =(b\pm \hbar )J_{\pm} |a,b\rangle \label{4.1.22}\]

so if \(|a,b\rangle\) is an eigenket of \(J_z\) with eigenvalue \(b\), \(J_{\pm} |a,b\rangle\) is either zero or an eigenket of \(J_z\) with eigenvalue \(b\pm \hbar\), that is, \(J_{\pm} |a,b\rangle =C_{\pm}|a,b\pm \hbar \rangle\) where \(C_{\pm} (a,b)\) is a normalization constant, taking the initial \(|a,b\rangle\) to be normalized. Just as with the simple harmonic oscillator, we have to find these normalization constants in order to compute matrix elements. All the physics is in the matrix elements.
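The commutator \([J_z,J_{\pm}]=\pm \hbar J_{\pm}\) used above follows in one line from \([J_i,J_j]=i\hbar \varepsilon_{ijk}J_k\), with \([J_z,J_x]=i\hbar J_y\) and \([J_z,J_y]=-i\hbar J_x\):

\[ [J_z,J_{\pm}]=[J_z,J_x]\pm i[J_z,J_y]=i\hbar J_y\pm i(-i\hbar J_x)=\pm \hbar (J_x\pm iJ_y)=\pm \hbar J_{\pm}. \]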

The squared norm of \(J_{\pm} |a,b\rangle\) is

\[ ||J_{\pm} |a,b\rangle ||^2=\langle a,b|J^{\dagger}_{\pm} J_{\pm} |a,b\rangle =\langle a,b|J_{\mp} J_{\pm} |a,b\rangle \label{4.1.23}\]

and

\[ J_{\mp}J_{\pm} =(J_x\mp iJ_y)(J_x\pm iJ_y)=J^2_x+J^2_y\pm i[J_x,J_y]=J^2-J^2_z\mp \hbar J_z \label{4.1.24}\]

from which

\[ ||J_{\pm} |a,b\rangle ||^2=\langle a,b|J^2-J^2_z\mp \hbar J_z|a,b\rangle =a-b^2\mp \hbar b, \label{4.1.25}\]

recalling that \(\langle a,b|a,b\rangle =1\).

Now \(a\), being the eigenvalue of a sum of squares of Hermitian operators, is necessarily nonnegative, and \(b\) is real. Moreover, the squared norms above cannot be negative, so \(a-b^2\mp \hbar b\ge 0\). Hence for a given \(a\), \(b\) is *bounded*: there must be a \(b_{max}\) and a (negative or zero) \(b_{min}\). But this must mean that

\[ ||J_+|a,b_{max}\rangle ||^2=a-b^2_{max}-\hbar b_{max}=0, \;\; ||J_-|a,b_{min}\rangle ||^2=a-b^2_{min}+\hbar b_{min}=0. \label{4.1.26}\]

Note that for a given \(a\), \(b_{max}\) and \(b_{min}\) are determined uniquely -- there cannot be two kets with the same \(a\) but different \(b\) annihilated by \(J_+\). It also follows immediately that \(a=b_{max}(b_{max}+\hbar )\) and \(b_{min}=-b_{max}\). Furthermore, we know that if we keep operating on \(|a,b_{min}\rangle\) with \(J_+\), we generate a sequence of kets with \(J_z\) eigenvalues \(b_{min}+\hbar ,\;\; b_{min}+2\hbar ,\;\; b_{min}+3\hbar ,\dots\). This series must terminate, and the only possible way for that to happen is for \(b_{max}\) to be equal to \(b_{min}+n\hbar\) with \(n\) a nonnegative integer. Since \(b_{min}=-b_{max}\), this gives \(2b_{max}=n\hbar\), from which it follows that \(b_{max}\) is either an integer or half an odd integer times \(\hbar\).
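The step \(b_{min}=-b_{max}\) can be spelled out as follows. Both conditions for the ends of the ladder equal the same \(a\), so equating them and factorizing gives

\[ b_{max}^2+\hbar b_{max}=b_{min}^2-\hbar b_{min}\;\;\Rightarrow\;\;(b_{max}+b_{min})(b_{max}-b_{min}+\hbar )=0. \]

Since \(b_{max}\ge b_{min}\), the second factor is strictly positive, which leaves \(b_{min}=-b_{max}\).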

At this point, we switch to the standard notation. We have established that the eigenvalues of \(J_z\) form a finite ladder with spacing \(\hbar\). We write them as \(m\hbar\), and \(j\) is used to denote the maximum value of \(m\), so the eigenvalue of \(J^2\) is \(a=j(j+1)\hbar^2\). Both \(j\) and \(m\) will be integers or half-odd integers, but the *spacing* of the ladder of \(m\) values is always unity. Although we have been writing \(|a,b\rangle\) with \(a=j(j+1)\hbar^2\), \(b=m\hbar\), we shall henceforth follow convention and write \(|j,m\rangle\).

# Summary

The operators \(\vec{J}^2\), \(J_z\) have a common set of orthonormal eigenkets \(|j,m\rangle\),

\[ \begin{matrix} \vec{J}^2|j,m\rangle =j(j+1)\hbar^2|j,m\rangle \\ J_z|j,m\rangle =m\hbar |j,m\rangle \\ \langle j,m|j,m\rangle =1 \end{matrix} \label{4.1.27}\]

where \(j\), \(m\) are integers or half-odd integers. The allowed quantum numbers \(m\) form a ladder with step spacing unity; the maximum value of \(m\) is \(j\) and the minimum value is \(-j\).
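The ladder of allowed \(m\) values for a given \(j\) can be enumerated in a couple of lines. The helper below is a hypothetical illustration, not part of the original text; it uses exact fractions so that half-odd-integer values are represented without rounding.

```python
from fractions import Fraction

def allowed_m(j):
    """Return the ladder m = -j, -j+1, ..., j for integer or half-odd-integer j >= 0."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a nonnegative integer or half-odd integer")
    # There are 2j + 1 rungs, spaced by one unit
    return [-j + k for k in range(int(2 * j) + 1)]

for j in (0, Fraction(1, 2), 1, Fraction(3, 2)):
    print(j, [str(m) for m in allowed_m(j)])
```

Each \(j\) gives \(2j+1\) states, which is the dimension of the corresponding matrix representation.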

# Normalizing \(J_+\) and \(J_-\)

It is now straightforward to compute the normalization factors needed to find matrix elements:

\[ ||J_{\pm} |j,m\rangle ||^2=\langle j,m|J^2-J^2_z\mp \hbar J_z|j,m\rangle =(j(j+1)\hbar^2-m(m\pm 1)\hbar^2)\langle j,m|j,m\rangle , \label{4.1.28}\]

and \(\langle j,m|j,m\rangle =1\), so

\[ \begin{matrix} J_+|j,m\rangle =\sqrt{j(j+1)-m(m+1)} \hbar |j,m+1\rangle \\ J_-|j,m\rangle =\sqrt{j(j+1)-m(m-1)} \hbar |j,m-1\rangle \end{matrix}. \label{4.1.29}\]

With these formulas, and the base set of normalized eigenkets \(|j,m\rangle\), we are in a position to construct explicit matrix representations of the angular momentum algebra for any integer or half-odd-integer value of angular momentum \(j\).
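As an illustration of such a construction, the sketch below builds \(J_z\), \(J_+\) and \(J_-\) from the formulas above in the basis ordered \(m=j,j-1,\dots ,-j\), then checks \([J_x,J_y]=i\hbar J_z\) and \(J^2=j(j+1)\hbar^2\). The function names, the basis ordering, the use of \(2j\) as an integer argument and the units with \(\hbar=1\) are all assumptions made for the example.

```python
import math

HBAR = 1.0  # units with hbar = 1, chosen for convenience

def angular_momentum_matrices(two_j):
    """Return (Jz, Jplus, Jminus) for j = two_j / 2 in the basis m = j, j-1, ..., -j."""
    j = two_j / 2.0
    dim = two_j + 1
    ms = [j - k for k in range(dim)]
    jz = [[HBAR * ms[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    jp = [[0.0] * dim for _ in range(dim)]
    jm = [[0.0] * dim for _ in range(dim)]
    for c, m in enumerate(ms):
        if c > 0:                  # J+ |j,m> lands on |j,m+1>, one row up
            jp[c - 1][c] = HBAR * math.sqrt(j * (j + 1) - m * (m + 1))
        if c < dim - 1:            # J- |j,m> lands on |j,m-1>, one row down
            jm[c + 1][c] = HBAR * math.sqrt(j * (j + 1) - m * (m - 1))
    return jz, jp, jm

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b, scale=1.0):
    """Return a + scale * b, entry by entry."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_abs(a):
    """Largest absolute value among the matrix entries."""
    return max(abs(x) for row in a for x in row)

two_j = 3                                       # j = 3/2, an arbitrary example
jz, jp, jm = angular_momentum_matrices(two_j)
jx = [[(p + q) / 2 for p, q in zip(rp, rm)] for rp, rm in zip(jp, jm)]         # (J+ + J-)/2
jy = [[(p - q) / (2 * 1j) for p, q in zip(rp, rm)] for rp, rm in zip(jp, jm)]  # (J+ - J-)/(2i)

# [Jx, Jy] - i*hbar*Jz should vanish
comm = add(matmul(jx, jy), matmul(jy, jx), scale=-1.0)
comm_residual = max_abs(add(comm, jz, scale=-1j * HBAR))

# Jx^2 + Jy^2 + Jz^2 should equal j(j+1)*hbar^2 times the identity
j = two_j / 2.0
jsq = add(add(matmul(jx, jx), matmul(jy, jy)), matmul(jz, jz))
identity = [[1.0 if r == c else 0.0 for c in range(two_j + 1)] for r in range(two_j + 1)]
casimir_residual = max_abs(add(jsq, identity, scale=-j * (j + 1) * HBAR ** 2))
print(comm_residual, casimir_residual)          # both should be zero up to rounding
```

Changing `two_j` to any other nonnegative integer gives the corresponding \((2j+1)\)-dimensional representation.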

*Historical note*: the use of \(m\) to denote the component of angular momentum in one direction came about because a Bohr-type electron in orbit is a current loop, with a magnetic moment parallel to its angular momentum, so the \(m\) measured the component of magnetic moment in a chosen direction, usually along an external magnetic field, and \(m\) is often called the *magnetic* quantum number.

# Contributors

- Michael Fowler (Beams Professor, Department of Physics, University of Virginia)