7.4: Boltzmann Distribution
Distributions
Whenever we have a collection of items with various properties, we can talk about how those properties are distributed amongst the items. Sometimes these properties are distributed in a very uninteresting manner – uniformly, for example. But a well-organized and discernible pattern in the distribution of properties in a population can give us a great deal of information. In addition, the effect of external factors on these distributions can tell us even more. For example, one could plot the number of people in the U.S. that retire each year as a function of their age. Imposing constraints such as what state they live in, or the profession from which they retire, gives new distribution curves, from which additional information can be gleaned.
In the case of a collection of particles, the dominant property that we use as a criterion for a distribution is energy. We have a constraint that our collection conserves energy – there is no energy gained or lost by our collection of particles. While we generally assume that the particles don't interact with each other (at least not in the classical sense of inter-particle forces – their wave functions can overlap, which in some sense is 'interaction'), we nevertheless allow for the energy to distribute itself 'randomly' (whatever that means) amongst the particles, and ask what this distribution looks like.
Given that the energy is distributed randomly, every specific case is on the table – all of the energy could go to one particle, while all the others have zero, for instance. But we are interested in the distribution we will observe, and from our studies of thermal and statistical physics, we know that this means the equilibrium state, which we define as the macrostate with the largest number of microstates. Clearly there are many more ways of distributing the energy that look the same to us macroscopically than there are ways of giving all the energy to a single particle, which is why we never observe this anomalous state.
Distinguishable Particles
In previous sections, when we talked about the quantum mechanics of collections of particles, we used the words "indistinguishable" and "identical" interchangeably, but we need to take a moment here to discuss the subtle difference in our usage of these words. If we happen to be talking about a collection of all the same type of particle, then those particles are identical, but if we happen to be in the realm of classical physics (for example, a collection of identical grains of sand), then none of the quantum mechanical effects like the exclusion principle apply. While the grains of sand are identical, we can "distinguish" them by, for example, watching their individual motions carefully.
Even in the realm of very small particles like electrons, they can have very localized wave functions, making the collection – in principle, if not in practice – distinguishable. But as soon as we make the circumstances such that the wave functions of the various particles overlap significantly (so that when we make a measurement we truly don't know which electron we have found), then these particles become indistinguishable, and quantum effects like exclusion take hold.
The problem of the energy distribution of a collection of identical-but-distinguishable particles was solved by Boltzmann, and this is covered in classes on statistical physics such as Physics 9HB. Boltzmann did this well before the advent of quantum theory, so the "distinguishable" criterion was not known to him, though we will see that it is critical. We start with a discrete set of states, such as we would find for a bound quantum-mechanical particle. According to Boltzmann, the probability that a randomly-sampled particle will be found to be in state \(n\) (giving it an energy \(E_n\)) is given by:
\[ P\left(E_n\right) = Ae^{-E_n/k_BT} \]
The coefficient \(A\) is a constant that ensures that this is a properly-normalized probability, \(T\) is the temperature of the collection of particles, and \(k_B\) is the constant that bears Boltzmann's name. We find \(A\) in the same way that we find any normalization constant for a wave function – set the sum of all the probabilities equal to one:
\[ 1=\sum\limits_nP\left(E_n\right) = A\sum\limits_ne^{-E_n/k_BT} \;\;\; \Rightarrow \;\;\; A=\dfrac{1}{\sum\limits_ne^{-E_n/k_BT}} \;\;\; \Rightarrow \;\;\; P\left(E_n\right) = \dfrac{e^{-E_n/k_BT}}{\sum\limits_ne^{-E_n/k_BT}}\]
It is a subtle-but-important fact that the "\(n\)" that appears in the subscript above is a shorthand that completely defines the state. Recall that for 1-dimensional bound states, this designation uniquely defined the state and the energy, but when we moved on to three dimensions, we found that a single subscript was no longer sufficient – more quantum numbers appeared. Most notably, we found cases where multiple states exhibited the same energy (we called these "degenerate states"). The subscript \(n\) above therefore abstractly represents all the quantum numbers possible for the states that exhibit the energy \(E_n\). The sum over \(n\) is therefore a sum over states, and multiple terms can contain the same energy values if there is a degeneracy.
This is a probability distribution, and like any probability distribution, we can use it to compute an average:
\[ \left<\;E\;\right> = \sum\limits_n E_n \;P\left(E_n\right) = A\sum\limits_n E_n \;e^{-E_n/k_BT} \]
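As a concrete illustration, here is a minimal numerical sketch of these two formulas. The particular spectrum (evenly-spaced state energies of \(0.01\;eV\)), the temperature, and the number of states kept in the sum are placeholder assumptions chosen just for demonstration; any discrete set of state energies could be substituted.
```python
import numpy as np

# Placeholder assumptions: evenly-spaced state energies E_n = n * (0.01 eV),
# a temperature of 300 K, and 50 states (enough that higher terms are negligible).
kB = 8.617e-5                 # Boltzmann constant in eV/K
T = 300.0                     # temperature in K
E = 0.01 * np.arange(50)      # state energies in eV (one entry per state)

# Boltzmann factors and the normalization constant A
boltzmann_factors = np.exp(-E / (kB * T))
A = 1.0 / boltzmann_factors.sum()
P = A * boltzmann_factors     # probability of finding a particle in each state

# Average energy: <E> = sum_n E_n P(E_n)
E_avg = np.sum(E * P)
print(f"sum of probabilities = {P.sum():.6f}")           # should be 1
print(f"<E> = {E_avg:.4f} eV,  kB*T = {kB*T:.4f} eV")
```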
Occupation Number
For many applications, it can be more useful to have a measure of how many particles are in a state with a given energy, rather than the probability of being in that state. This occupation number of a state is simply the number of particles, out of the total number \(N\), that are found in state \(n\), and is expressed with yet another version of that letter:
\[ \mathcal{N}\left(E_n\right) = N\;P\left(E_n\right)\]
For the case of distinguishable particles, this is the Boltzmann distribution:
\[ \mathcal{N}\left(E_n\right) = NAe^{-E_n/k_BT}\]
Alert
Remember, this is the population of the state designated by \(n\), not the number of particles with energy \(E_n\), though all of these particles do have the same energy. If there is no degeneracy, then this does equal the number of particles with that energy; otherwise, to obtain this number, one must multiply the occupation number by the order of the degeneracy.
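As a quick illustration of this distinction (with a made-up degeneracy chosen just for the example), suppose the level \(E_2\) happens to be 3-fold degenerate, with states labeled \(2a\), \(2b\), and \(2c\). Each of these states has the same occupation number \(\mathcal{N}\left(E_2\right)\), so the number of particles with energy \(E_2\) is:
\[ \left(\text{number of particles with energy }E_2\right) = 3\;\mathcal{N}\left(E_2\right) = 3NAe^{-E_2/k_BT} \]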
Given that the occupation number differs from the probability only by a factor of the total number of particles, we can rewrite the equation we use for computing the average energy in terms of the occupation number:
\[ \left<\;E\;\right> = \dfrac{1}{N}\sum\limits_n E_n \;\mathcal{N}\left(E_n\right) \]
Approximating with a Continuum
The power of statistical mechanics comes from the large number of particles that are involved, and unless the temperature is very low, many energy levels are available to the particles in the system. These are prime conditions for changing from the clunky use of summations to using integration. Let's see how we would go about doing this.
Let's look at a completely fictional (not really physical) example of 30 particles. In this made-up collection of particles, the energy levels grow in proportion to the square of the level:
\[ E_n = n^2\hbar\omega\;,\;\;\; n=0,\;1,\;2,\;\dots\]
We'll further construct these states so that they are (\(n+1\))-fold degenerate – there are \(n+1\) unique quantum states for energy level \(E_n\). And finally, in our fictional universe, we'll assume that the distribution of particles amongst the states is as shown in the chart below (the occupation numbers are zero for all higher energy levels than those shown).
Figure 7.4.1 – Fictional Distribution of Particles
The degenerate states are labeled with a, b, ... Each degenerate state is given the same occupation number (which can even be fractional), because there is nothing in our system that would make one state more likely than another if they exhibit the same energy. These are just relative values, because ultimately what we are modeling is how frequently we will find each condition when we select a particle at random from the collection. So for example, it is 8 times more likely that we will select a particle from state 1a than from state 4c.
We can of course use this to compute the average energy per particle, by multiplying the probability of each energy by the value of the energy and summing over all the possibilities:
\[\left<\;E\;\right> = \frac{8}{30}E_0+\frac{8}{30}E_1+\frac{6}{30}E_2+\frac{4}{30}E_3+\frac{2.5}{30}E_4+\frac{1.5}{30}E_5 = \frac{1}{30}\left[8(0)+8(1)+6(4)+4(9)+2.5(16)+1.5(25)\right]\hbar\omega = 4.85\hbar\omega\]
Okay, let's get back to the task at hand. Let's construct a (somewhat awkward) means of averaging the energy as follows:
- Plot the occupation numbers on the vertical axis of a graph where the horizontal axis is the energy:
Figure 7.4.2.a – Fictional Occupation Number Graph
- Construct rectangles whose upper-left corners coincide with each of these points, and are side-by side (the colors of these rectangles are coordinated with the colors in the table given in Figure 7.4.1):
Figure 7.4.2.b – Fictional Occupation Number Graph with Rectangles
- Take the area of a rectangle and multiply it by the degeneracy of that state, then divide it by the width of the rectangle. Construct a number like this for every rectangle. This gives the number of particles associated with every rectangle. A sum of all these is the total number of particles.
- Now multiply the number of particles associated with every rectangle by the energy associated with the same rectangle, and sum these over all the rectangles. Dividing this by the total number of particles gives the average energy.
Every rectangle's width represents the span of energy covered in going from that energy level to the next. In this case, we can see that the levels get farther apart as the energy goes up. If we were to step back and look at collections of energy levels from a distance, we would notice that the levels for this fictional collection are more densely-packed at lower energies. We can even define a "density" as a ratio of one energy level per span. That is, the "energy level density" can be defined as:
\[\text{energy level density} = \dfrac{1}{\text{width of rectangle for that energy level}} \]
There are \(n+1\) quantum states for energy level \(E_n\) in this example, so we can define a density of quantum states by simply multiplying the energy level density by the degeneracy:
\[ \text{density of quantum states} = \dfrac{\text{degeneracy of that energy level}}{\text{width of rectangle for that energy level}} \]
But in step 3 above, this was exactly the number that we multiplied by the area of each rectangle before summing. This quantity is a function of energy (in the case above, a piecewise-constant function), which we call the density of states, and which we generally write as \(D\left(E\right)\). In general, this function contains information both about how close the energy levels near energy \(E\) are to their neighbors, and about the degeneracy of the states at that energy.
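To make this concrete for the fictional spectrum above: the rectangle for level \(n\) spans the gap up to the next level, so its width is \(E_{n+1}-E_n=\left(2n+1\right)\hbar\omega\), while its degeneracy is \(n+1\). The (piecewise) density of states is therefore:
\[ D\left(E_n\right) = \dfrac{n+1}{\left(2n+1\right)\hbar\omega} \]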
Labeling the density of states according to its rectangle, and following step 4 above, we get this for the average energy per particle:
\[ \left<\;E\;\right> = \dfrac{E_0\left[{Area}_0\;D\left(E_0\right)\right]+E_1\left[{Area}_1\;D\left(E_1\right)\right]+\cdots}{ {Area}_0\;D\left(E_0\right)+{Area}_1\;D\left(E_1\right)+\cdots}\]
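To check that this bookkeeping reproduces the straightforward average computed earlier, here is a minimal numerical sketch in units where \(\hbar\omega = 1\). The level populations (8, 8, 6, 4, 2.5, 1.5 out of 30) are taken from the average computed above; the per-state occupation numbers are these populations divided by the \(\left(n+1\right)\)-fold degeneracies, standing in for the values in the missing chart.
```python
import numpy as np

# Fictional spectrum: E_n = n^2 (in units of hbar*omega), with (n+1)-fold degeneracy.
n = np.arange(6)                        # only levels 0..5 are occupied
E = n**2                                # level energies
degeneracy = n + 1

# Level populations taken from the average computed earlier (30 particles total);
# per-state occupation numbers follow by dividing by the degeneracy.
population = np.array([8.0, 8.0, 6.0, 4.0, 2.5, 1.5])
occupation = population / degeneracy    # height of each rectangle

# Each rectangle spans from E_n up to E_{n+1}, so its width is (2n+1).
width = 2 * n + 1
area = occupation * width               # rectangle areas
D = degeneracy / width                  # density of states near each level

# Step 3: particles per rectangle = area * D;  Step 4: weighted average of the energies.
particles = area * D                    # recovers the level populations
N_total = particles.sum()               # 30
E_avg = np.sum(E * particles) / N_total
print(f"N = {N_total},  <E> = {E_avg:.2f} (hbar*omega)")   # N = 30.0, <E> = 4.85
```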
Finally, let's imagine that there are many, many more particles, and many, many more energy levels occupied, and that the gaps between the energy levels are extremely small compared to the total energy of the system. In this case, the plotted occupation number points merge into a smooth curve:
Figure 7.4.3 – Occupation Number Graph
The areas of the rectangles become simply the height of the curve \(\mathcal{N}\left(E\right)\) at each point, multiplied by an infinitesimal width \(dE\). The density of states we can describe as the degeneracy of quantum states at energy \(E\), divided by the energy gap between the quantum states at \(E\) and \(E+dE\). The result is a function \(D\left(E\right)\) that has units of energy\(^{-1}\). This gives us a means for computing the average energy with integrals:
\[ \left<\;E\;\right> = \dfrac{\int\limits_0^\infty E\;\mathcal{N}\left(E\right)D\left(E\right)dE}{\int\limits_0^\infty \mathcal{N}\left(E\right)D\left(E\right)dE} \]
It's not hard to see why the form of this result differs from that of Equation 7.4.6. When summing over a discrete set of states, we don't have to worry about the degeneracy – each state is counted individually – which is not the case when we integrate over energies. In addition, stepping from one state to the next is handled automatically by the index of the summation, while we have to tell an integral how big a step to take to get to the next state. The density of states function takes care of both of these problems for the integral. It's a small price to pay for being able to integrate a function instead of doing an infinite sum.
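Stated compactly, the density of states is the conversion factor between the two pictures: for any quantity \(f\) that depends only on the energy of the state,
\[ \sum\limits_n f\left(E_n\right) \;\;\longrightarrow\;\; \int f\left(E\right)\,D\left(E\right)\,dE \]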
A Simple Example
While it helped to explain the density of states function, the example above was made-up (the distribution was not Boltzmann), so let's look at a very simple physical example. Let's let the particles coexist in a one-dimensional harmonic oscillator potential, so that they individually exhibit the energy spectrum:
\[ E_n = \left(n+\frac{1}{2}\right)\hbar\omega_c\;,\;\;\;\;\;\;\;\;n=0,\;1,\;2,\;\dots\]
We will simplify things somewhat by choosing the zero point of our potential energy differently (we are always free to do this). Doing this, we see that the energy levels start at zero, and are evenly-spaced. This is a one-dimensional well, so there is no degeneracy. Therefore the density of states function is very easy to come up with. Following the prescription of Equation 7.4.10, we see that the density of states for this situation is:
\[ D\left(E\right) = \dfrac{1}{\hbar\omega_c} \]
We can use this to compute the average energy per particle by combining the occupation number for the Boltzmann distribution (Equation 7.4.5) with Equation 7.4.12. Note that we are now treating the energy as a continuous variable, so we drop the "\(n\)" subscript from the Boltzmann occupation number function:
\[ \left<\;E\;\right> = \dfrac{\int\limits_0^\infty E\;NAe^{-E/k_BT}\;\dfrac{1}{\hbar\omega_c}\;dE}{\int\limits_0^\infty NAe^{-E/k_BT}\;\dfrac{1}{\hbar\omega_c}\;dE} = \dfrac{\int\limits_0^\infty E\;e^{-E/k_BT}\;dE}{\int\limits_0^\infty e^{-E/k_BT}\;dE} = \dfrac{\left(k_BT\right)^2}{k_BT} = k_BT \]
If we perform the calculation of the average energy per particle for this physical system using the summation and Equation 7.4.5, we get the following:
\[ \left<\;E\;\right> = \dfrac{\hbar\omega_c}{e^{\hbar\omega_c/k_BT}-1} \]
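For reference, this closed form follows from summing geometric series. Using the shifted spectrum \(E_n = n\hbar\omega_c\) and writing \(x = e^{-\hbar\omega_c/k_BT}\):
\[ \left<\;E\;\right> = \dfrac{\sum\limits_{n=0}^\infty E_n\,e^{-E_n/k_BT}}{\sum\limits_{n=0}^\infty e^{-E_n/k_BT}} = \hbar\omega_c\,\dfrac{\sum\limits_{n=0}^\infty n\,x^n}{\sum\limits_{n=0}^\infty x^n} = \hbar\omega_c\,\dfrac{x/\left(1-x\right)^2}{1/\left(1-x\right)} = \dfrac{\hbar\omega_c\,x}{1-x} = \dfrac{\hbar\omega_c}{e^{\hbar\omega_c/k_BT}-1} \]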
How do we reconcile these two different results? To use the integral approximation of the sum, we assumed that the energy levels are very close together, but how do we define "very close?" Well, it turns out that the answer is, "compared to \(k_BT\)." If \(\hbar\omega_c \ll k_BT\), then the power series expansion of the exponential can be well-approximated by its first two terms:
\[ e^{\hbar\omega_c/k_BT} \approx 1+\dfrac{\hbar\omega_c}{k_BT} \]
Plugging this in above shows that in this limit, the exact result obtained from the sum converges to the result obtained from the integral:
\[ \left<\;E\;\right> = \dfrac{\hbar\omega_c}{e^{\hbar\omega_c/k_BT}-1} \approx \dfrac{\hbar\omega_c}{\left(1+\dfrac{\hbar\omega_c}{k_BT}\right)-1} = k_BT \]
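For a quick numerical confirmation of this limit, the sketch below compares the exact sum result to the continuum result \(k_BT\) for a few values of the ratio \(\hbar\omega_c/k_BT\) (the specific values are arbitrary choices for illustration):
```python
import numpy as np

# Compare the exact (sum) result  hbar*w / (exp(hbar*w/kB*T) - 1)
# with the continuum (integral) result  kB*T, as a function of the ratio
# r = hbar*w / (kB*T).  Work in units where kB*T = 1, so <E>_integral = 1.
for r in [2.0, 0.5, 0.1, 0.01]:        # arbitrary illustrative ratios
    E_sum = r / np.expm1(r)            # exact result in units of kB*T
    print(f"hbar*w/kBT = {r:5.2f}:  <E>_sum = {E_sum:.4f} kBT,  <E>_integral = 1.0000 kBT")
```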