# 5.3: Heat Capacity of an Ideal Gas

## 5.3.1 The equipartition theorem

**Equipartition theorem**. (For classical systems.) Suppose the Hamiltonian *H*(Γ) decouples into one piece involving a single phase space variable (call it *H*_{1}) plus another piece which involves all the other phase space variables (call it *H*_{2}(Γ_{2})). Suppose further that the energy depends quadratically upon this single phase space variable, and that this variable may take on values from −∞ to +∞. Then, in classical statistical mechanics, the mean contribution to the energy due to that single variable is

\[ \left\langle H_{1}\right\rangle=\frac{1}{2} k_{B} T.\]

Notice how general this theorem is. The remaining piece of the Hamiltonian, *H*_{2}(Γ_{2}), might decouple further or it might not. The phase space variable entering *H*_{1} might be a momentum,

\[ H_{1}(p)=\frac{p^{2}}{2 m},\]

or an angular momentum,

\[ H_{1}(\ell)=\frac{\ell^{2}}{2 I},\]

or even a position coordinate, as in the simple harmonic oscillator energy

\[ H_{1}(x)=\frac{1}{2} k x^{2}.\]

Furthermore, in all these circumstances, the mean energy is *independent* of the particular parameters *m* or *I* or *k*; it depends only upon the temperature. This explains the origin of the name “equipartition”: the mean translational energy due to motion in the *x* direction is equal to the mean rotational energy due to the change of *θ*, and this holds true even if the gas is a mixture of molecules with different masses and different moments of inertia. The energy is equally partitioned among all these different ways of holding energy.

**Proof**. We will write

\[ H_{1}(p)=a p^{2},\]

although the variable might not be a linear momentum. Then the average of *H*_{1} is

\[ \left\langle H_{1}\right\rangle=\frac{\int d \Gamma H_{1} e^{-\beta H(\Gamma)}}{\int d \Gamma e^{-\beta H(\Gamma)}}=\frac{\int_{-\infty}^{+\infty} d p H_{1} e^{-\beta H_{1}} \int d \Gamma_{2} e^{-\beta H_{2}\left(\Gamma_{2}\right)}}{\int_{-\infty}^{+\infty} d p e^{-\beta H_{1}} \int d \Gamma_{2} e^{-\beta H_{2}\left(\Gamma_{2}\right)}}.\]

Clearly, the integrals over Γ_{2} cancel in this last expression. (This explains why the form of *H*_{2} is irrelevant to the theorem.) We are left with

\[ \left\langle H_{1}\right\rangle=\frac{\int_{-\infty}^{+\infty} d p a p^{2} e^{-\beta a p^{2}}}{\int_{-\infty}^{+\infty} d p e^{-\beta a p^{2}}}.\]

These two integrals could be evaluated in terms of Gamma functions (see Appendix C), but they don’t need to be evaluated yet. Recall our “slick trick” of parametric differentiation; using it we can write

\[ \left\langle H_{1}\right\rangle=-\frac{d}{d \beta} \ln \left[\int_{-\infty}^{+\infty} d p e^{-\beta a p^{2}}\right].\]
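As a quick check of this step, the derivative can be written out explicitly. The label \(Z_{1}(\beta)\) for the integral in the denominator is introduced here only for convenience; it does not appear elsewhere in the text.

\[ -\frac{d}{d \beta} \ln Z_{1}=-\frac{1}{Z_{1}} \frac{d Z_{1}}{d \beta}=\frac{\int_{-\infty}^{+\infty} d p \, a p^{2} e^{-\beta a p^{2}}}{\int_{-\infty}^{+\infty} d p \, e^{-\beta a p^{2}}}, \qquad Z_{1}(\beta) \equiv \int_{-\infty}^{+\infty} d p \, e^{-\beta a p^{2}}.\]

Differentiating \(e^{-\beta a p^{2}}\) with respect to *β* pulls down a factor of \(-a p^{2}\), which is exactly what turns the denominator into the numerator.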

The integral that remains is of Gaussian character and we could evaluate it using the results of Appendix B. But before rushing in to integrate, let’s employ the substitution \( u=\sqrt{\beta a} p\) to find

\[ \left\langle H_{1}\right\rangle=-\frac{d}{d \beta} \ln \left[\frac{1}{\sqrt{\beta a}} \int_{-\infty}^{+\infty} d u e^{-u^{2}}\right]=-\frac{d}{d \beta}\left\{\ln \left[\frac{1}{\sqrt{\beta}}\right]+\ln \left[\frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} d u e^{-u^{2}}\right]\right\}.\]

This last expression shows that there’s no need to evaluate the integral. Whatever its value is, it is some number, not a function of *β*, so when we take the derivative with respect to *β* the term involving that number will differentiate to zero. Similarly for the constant *a*, which explains why the equipartition result is independent of that prefactor. We are left with

\[ \left\langle H_{1}\right\rangle=-\frac{d}{d \beta} \ln \frac{1}{\sqrt{\beta}}=\frac{1}{2} \frac{d}{d \beta} \ln \beta=\frac{1}{2} \frac{1}{\beta}\]

or, finally, the desired equipartition result

\[ \left\langle H_{1}\right\rangle=\frac{1}{2} k_{B} T.\]
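The result can also be checked numerically. The sketch below is an illustration, not part of the original derivation: it evaluates the ratio of integrals for \(\langle a p^{2}\rangle\) by simple midpoint quadrature and compares it with \(1/(2\beta)\). The function name, the cutoff `p_max`, and the step count are all assumptions chosen for this example.

```python
import math

def mean_quadratic_energy(a, beta, n_steps=20_000):
    """Estimate <a p^2> with Boltzmann weight exp(-beta a p^2) by midpoint quadrature."""
    # Truncate the infinite range where the weight has fallen to exp(-100);
    # this cutoff is an assumption of the sketch, not part of the theorem.
    p_max = 10.0 / math.sqrt(beta * a)
    dp = 2.0 * p_max / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps):
        p = -p_max + (i + 0.5) * dp          # midpoint of the i-th slice
        energy = a * p * p
        weight = math.exp(-beta * energy)
        numerator += energy * weight
        denominator += weight
    return numerator / denominator           # the common factor dp cancels

if __name__ == "__main__":
    beta = 2.0                                # units with k_B T = 1/beta
    for a in (0.1, 1.0, 25.0):
        print(a, mean_quadratic_energy(a, beta), 1.0 / (2.0 * beta))
```

For each value of the prefactor `a` the printed estimate should agree with `1/(2*beta)` to within the quadrature error, illustrating that the mean energy depends on the temperature alone.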

## 5.3.2 Applications of equipartition; Comparison with experiment

## 5.3.3 Crossover between classical and quantal behavior; Freeze out

At high temperatures, typical thermal energies are much greater than level spacings. Transitions from one level to another are very easy to do and the granular character of the quantized energies can be ignored. This is the classical limit, and equipartition holds!

At low temperatures, typical thermal energies are less than the level spacing between the ground state and the first excited state. There is so little thermal energy around that the molecule cannot even be excited out of its ground state. Virtually all the molecules are in their ground states, and the excited states might as well just not exist.

In classical mechanics, a diatomic molecule offered a small amount of rotational energy will accept that energy and rotate slowly. But in quantum mechanics, a diatomic molecule offered a small amount of rotational energy will reject that energy and remain in the ground state, because the energy offered is not enough to lift it into the first excited state.^{2} The quantal diatomic molecule does not rotate at all at low temperatures, so it behaves exactly like a monatomic molecule with only center-of-mass degrees of freedom.

In short, we explain the high-temperature rotational specific heat (*c*^{rot}_{V} = *k*_{B}) through equipartition. We explain the low-temperature rotational specific heat (*c*^{rot}_{V} vanishes) through difficulty of promotion to the first quantal excited state. This fall-off of specific heat as the temperature is reduced is called “freeze out”.

The crossover between the high-temperature and low-temperature regimes occurs in the vicinity of a characteristic temperature *θ* at which the typical thermal energy is equal to the energy separation between the ground state and the first excited state. If the energies of these two states are ε_{0} and ε_{1} respectively, then we define the characteristic crossover temperature through

\[ k_{B} \theta \equiv \epsilon_{1}-\epsilon_{0}\]
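To make the crossover concrete, here is a minimal sketch that computes the rotational heat capacity of a single molecule as a function of *T*/*θ*. It assumes a rigid-rotor level scheme, which is not spelled out in the text above: levels labeled by *J* = 0, 1, 2, … with energy above the ground state equal to *J*(*J* + 1)/2 times *k*_{B}*θ* (so that ε_{1} − ε_{0} = *k*_{B}*θ*), and degeneracy 2*J* + 1. Nuclear-spin effects in homonuclear molecules are ignored. The function name and the cutoff `j_max` are hypothetical.

```python
import math

def rotational_heat_capacity(t_over_theta, j_max=400):
    """Rotational heat capacity per molecule, in units of k_B.

    Assumed (not from the text): rigid-rotor levels with
    (eps_J - eps_0) = J(J+1)/2 * k_B*theta and degeneracy 2J+1.
    """
    z = 0.0       # partition function
    sum_x = 0.0   # accumulates x * weight, with x = (eps_J - eps_0)/(k_B T)
    sum_x2 = 0.0  # accumulates x^2 * weight
    for j in range(j_max + 1):
        x = 0.5 * j * (j + 1) / t_over_theta
        weight = (2 * j + 1) * math.exp(-x)
        z += weight
        sum_x += x * weight
        sum_x2 += x * x * weight
    mean_x = sum_x / z
    # c / k_B equals the variance of E/(k_B T) in the canonical ensemble.
    return sum_x2 / z - mean_x * mean_x

if __name__ == "__main__":
    for t in (0.05, 0.2, 0.5, 1.0, 2.0, 10.0):
        print(t, rotational_heat_capacity(t))
```

Under these assumptions the output should be close to 1 (that is, *c*^{rot}_{V} ≈ *k*_{B}) well above *θ* and should fall toward zero well below *θ*, which is the freeze-out described above.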

*5.2 Generalized equipartition theorem and the ultra-relativistic gas*

a. Suppose the Hamiltonian *H*(Γ) decouples into two pieces

\[ H(\Gamma)=a|p|^{n}+H_{2}\left(\Gamma_{2}\right)\]

where *p* is some phase space variable that may take on values from −∞ to +∞, and where Γ_{2} represents all the phase space variables except for *p*. (Note that the absolute value |*p*| is needed in order to avoid, for example, taking the square root of a negative number in the case *n* = 1/2.) Show that, in classical statistical mechanics, the mean contribution to the energy due to that single variable is

\[ \left\langle a|p|^{n}\right\rangle=\frac{1}{n} k_{B} T\]

b. In special relativity, the energy of a free (i.e. non-interacting) particle is given by

\[ \sqrt{\left(m c^{2}\right)^{2}+(p c)^{2}}\]

where *c* is the speed of light. As you know, when \( v \ll c\) this reduces to the rest energy plus the non-relativistic kinetic energy, *mc*^{2} + *p*^{2}/2*m*. In the “ultra-relativistic” limit, where *v* is close to *c*, the energy is approximately *pc*. What is the heat capacity of a gas of non-interacting ultra-relativistic particles?

c. Estimate the crossover temperature between the non-relativistic and ultra-relativistic regimes.

*5.3 Another generalization of equipartition*

Consider the same situation as the equipartition theorem in the text, but now suppose the single phase space variable takes on values from 0 to +∞. What is the corresponding result for \(\left\langle H_{1}\right\rangle\)?

*5.4 Equipartition and the virial theorem*

Look up the term “virial theorem” in a classical mechanics textbook. Is there any relation between the virial theorem of classical mechanics and the equipartition theorem of classical statistical mechanics?

^{2}This paragraph is written in the “shorthand” language discussed on page 113, as if energy eigenstates were the only allowed quantal states.

## The \(\mathcal{O}\) Notation

Approximations are an important part of physics, and an important part of approximating is ensuring that the results are reliable and consistent. The \( \mathcal{O}\) notation (pronounced “the big-oh notation”) is an important and practical tool for making approximations reliable and consistent. The technique is best illustrated through an example. Suppose you desire an approximation for

\[ f(x)=\frac{e^{-x}}{1-x}\]

valid for small values of *x*, that is, \(|x| \ll 1\). You know that

\[ e^{-x}=1-x+\frac{1}{2} x^{2}-\frac{1}{6} x^{3}+\cdots\]

and that

\[ \frac{1}{1-x}=1+x+x^{2}+x^{3}+\cdots\]

so it seems that reasonable approximations are

\[ e^{-x} \approx 1-x\]

and

\[ \frac{1}{1-x} \approx 1+x\]

whence

\[ \frac{e^{-x}}{1-x} \approx(1-x)(1+x)=1-x^{2}. \tag{5.35}\]

Let’s try out this approximation at *x*_{0} = 0.01. A calculator shows that

\[ \frac{e^{-x_{0}}}{1-x_{0}}=1.0000503 \ldots\]

while the value for the approximation is

\[ 1-x_{0}^{2}=0.9999000.\]

This is a very poor approximation indeed: the deviation from *f*(0) = 1 is even of the wrong sign!

Let’s do the problem over again, but this time keeping track of exactly how much we’ve thrown away while making each approximation. We write

\[ e^{-x}=1-x+\frac{1}{2} x^{2}-\frac{1}{6} x^{3}+\cdots\]

as

\[ e^{-x}=1-x+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right),\]

where the notation \(\mathcal{O}\)(*x*^{3}) stands for the small terms that we haven’t bothered to write out explicitly. The symbol \(\mathcal{O}\)(*x*^{3}) means “terms that are about the magnitude of *x*^{3}, or smaller” and is pronounced “terms of order *x*^{3}”. The \(\mathcal{O}\) notation will allow us to make controlled approximations in which we keep track of exactly how good the approximation is.

Similarly, we write

\[ \frac{1}{1-x}=1+x+x^{2}+\mathcal{O}\left(x^{3}\right),\]

and find the product

\[ f(x)=\left[1-x+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right)\right] \times\left[1+x+x^{2}+\mathcal{O}\left(x^{3}\right)\right]\]

\[ =\quad\left[1-x+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right)\right]\]

\[ +\left[1-x+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right)\right] x\]

\[ +\left[1-x+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right)\right] x^{2}\]

\[ +\left[1-x+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right)\right] \mathcal{O}\left(x^{3}\right)\]

Note, however, that \( x \times \frac{1}{2} x^{2}=\mathcal{O}\left(x^{3}\right)\), and that \( x^{2} \times \mathcal{O}\left(x^{3}\right)=\mathcal{O}\left(x^{3}\right)\), and so forth, whence

\[ f(x)=\left[1-x+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right)\right] \tag{5.46}\]

\[ +\left[x-x^{2}+\mathcal{O}\left(x^{3}\right)\right] \tag{5.47}\]

\[ +\left[x^{2}+\mathcal{O}\left(x^{3}\right)\right] \tag{5.48}\]

\[ +\mathcal{O}\left(x^{3}\right)\]

\[ =1+\frac{1}{2} x^{2}+\mathcal{O}\left(x^{3}\right)\]

Thus we have the approximation

\[ f(x) \approx 1+\frac{1}{2} x^{2}\]

Furthermore, we know that this approximation is accurate to terms of order \(\mathcal{O}\)(*x*^{2}) (i.e. that the first neglected terms are of order \(\mathcal{O}\)(*x*^{3})). Evaluating this approximation at *x*_{0} = 0.01 gives

\[ 1+\frac{1}{2} x_{0}^{2}=1.0000500,\]

far superior to our old approximation (5.35).
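The comparison is easy to reproduce. The following sketch (the function names are chosen here for illustration) evaluates the exact function, the first attempt 1 − *x*², and the controlled approximation 1 + ½*x*², and then divides each error by the power of *x* at which that approximation is expected to fail.

```python
import math

def f(x):
    """The exact function exp(-x) / (1 - x)."""
    return math.exp(-x) / (1.0 - x)

def first_try(x):
    """Uncontrolled approximation: (1 - x)(1 + x) = 1 - x^2."""
    return 1.0 - x * x

def controlled(x):
    """Approximation that keeps every term through order x^2."""
    return 1.0 + 0.5 * x * x

if __name__ == "__main__":
    for x in (0.1, 0.01, 0.001):
        err_first = f(x) - first_try(x)
        err_controlled = f(x) - controlled(x)
        # If the first attempt is wrong at order x^2, err_first / x^2 should
        # approach a nonzero constant; if the controlled approximation is
        # correct through order x^2, err_controlled / x^3 should do the same.
        print(x, err_first / x**2, err_controlled / x**3)
```

As *x* shrinks, both ratios should settle toward constants (3/2 and 1/3 respectively, from the next terms in the series). This is the numerical signature of an error of order *x*² in the first attempt and of order *x*³ in the controlled approximation.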

What went wrong on our first try? The −*x*^{2} in approximation (5.35) is the same as the −*x*^{2} on line (5.47). However, lines (5.46) and (5.48) demonstrate that there were other terms of about the same size (that is, other “terms of order *x*^{2}”) that we neglected in our first attempt.

The \(\mathcal{O}\) notation is superior to the “dot notation” (such as · · ·) in that dots stand for “a bunch of small terms”, but the dots don’t tell you just how small they are. The symbol \(\mathcal{O}\)(*x*^{3}) also stands for “a bunch of small terms”, but in addition it tells you precisely how small those terms are. The \(\mathcal{O}\) notation allows us to approximate in a consistent manner, unlike the uncontrolled approximations where we ignore a “small term” without knowing whether we have already retained terms that are even smaller.