1.5: The Ergodic Theorem and the Virial Theorem
Thus far, with the exception of a brief discussion in Section 2, we have developed Lagrange's identity in a variety of ways, but have not rigorously taken the final step to produce the virial theorem. This last step involves averaging over time, and it is in this form that the theorem finds its widest application. However, in astrophysics few if any investigators live long enough to perform the time averages for which the theorem calls. Thus, one more step is needed, and it is this step which occasionally leads to difficulty and erroneous results. In order to replace the time averages with something observable, it is necessary to invoke the ergodic theorem.
The ergodic theorem is one of those fundamental physical concepts, like the Principle of Causality, which are so "obvious" as to appear axiomatic. Thus they are rarely discussed in the physics literature. However, to say that the ergodic theorem is obvious is to belittle an entire area of mathematics known as ergodic theory, which uses the mathematical language of measure theory. This language alone is enough to hide it forever from the eye of the average physical scientist. Since this theorem is central to obtaining what is commonly called the virial theorem, it is appropriate that we spend a little time on its meaning. As noted in the introduction, the distinction between an ensemble average and an average of macroscopic system parameters over time was not clear at the time of the formulation of the virial theorem. However, not too long after, Ludwig Boltzmann\(^{6}\) formulated a hypothesis which suggested the criterion under which time averages and ensemble (phase) averages would be the same. Maxwell later stated it this way: "The only assumption which is necessary for a direct proof is that the system if left to itself in its actual state of motion will, sooner or later, pass through every phase which is consistent with the equation of energy".\(^{7}\)
Essentially this constitutes what is most commonly meant by the ergodic theorem. Namely, if a dynamic system passes through every point in phase space then the time average of any macroscopic system parameter, say Q, is given by
\[\langle\mathrm{Q}\rangle_{\mathrm{t}}=\lim_{\mathrm{T} \rightarrow \infty} \frac{1}{\mathrm{T}} \int_{\mathrm{t}_0}^{\mathrm{t}_0+\mathrm{T}} \mathrm{Q}(\mathrm{t}) \, \mathrm{dt}=\langle\mathrm{Q}\rangle_{\mathrm{s}}\label{1.5.1}\]
where \(\langle\mathrm{Q}\rangle_{\mathrm{s}}\) is some sort of instantaneous statistical average of Q over the entire system.
The importance of this concept for statistical mechanics is clear. Theoretical considerations predict \(\langle\mathrm{Q}\rangle_{\mathrm{s}}\), whereas experiment provides something which might be construed to approximate \(\langle\mathrm{Q}\rangle_{\mathrm{t}}\).
No matter how rapidly something like the pressure or temperature of a gas is measured, the measurement requires a time which is long compared to the characteristic times of the system. The founders of statistical mechanics, such as Boltzmann, Maxwell and Gibbs, realized that a statement such as Equation \ref{1.5.1} was necessary to enable the comparison of theory with experiment, and thus a great deal of effort was expended to show, or at least define, the conditions under which dynamical systems were ergodic (i.e., would pass through every point in phase space).
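The content of Equation \ref{1.5.1} can be illustrated numerically with a minimal sketch. The system below is an assumption chosen for convenience and is not taken from the text: a one-dimensional harmonic oscillator with hypothetical amplitude `A`, frequency `omega` and phase `phi`, and the observable \(\mathrm{Q}=x^2\). This case is special in that the constant-energy "surface" is a single closed curve, so the trajectory really does visit every accessible phase, and the phase average is simply a uniform average over the phase angle.

```python
import numpy as np

def time_average(q_of_t, t0, T, n=200_001):
    """Approximate (1/T) * integral of q(t) over [t0, t0 + T] with the trapezoid rule."""
    t = np.linspace(t0, t0 + T, n)
    q = q_of_t(t)
    dt = t[1] - t[0]
    integral = dt * (q.sum() - 0.5 * (q[0] + q[-1]))
    return integral / T

# Hypothetical test system (an assumption, not from the text):
# harmonic oscillator x(t) = A cos(omega t + phi), observable Q = x^2.
A, omega, phi = 1.5, 2.0, 0.3

def Q(t):
    return (A * np.cos(omega * t + phi)) ** 2

# Time average along a single trajectory over a long but finite window T.
q_time = time_average(Q, t0=0.0, T=5000.0)

# Phase average over the constant-energy curve, sampled uniformly in the phase angle.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=400_000)
q_phase = np.mean((A * np.cos(theta)) ** 2)

# Both estimates should lie close to the analytic value A^2 / 2.
print(q_time, q_phase, A**2 / 2)
```

The finite window `T` and the finite sample size stand in for the infinite limit and the exact phase-space integral, so the two numbers agree only approximately.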
Indeed, as stated, the ergodic theorem is false, as was shown independently in 1913 by Rosenthal\(^{8}\) and Plancherel\(^{9}\). A more modern version of this can be seen easily by noting that no system trajectory in phase space may cross itself. Thus, such a curve may have no multiple points. This is effectively a statement that the system boundary conditions uniquely determine the system's past and future, and it is the essence of the Liouville theorem of classical mechanics. Such a curve is known topologically as a Jordan curve, and it is a well-known topological theorem that a Jordan curve cannot pass through all points of a multi-dimensional space. In the language of measure theory, a multi-dimensional space-filling curve would have a measure equal to that of the space, whereas a Jordan curve, being one-dimensional, would have measure zero. Thus, the ergodic hypothesis was modified to become the quasi-ergodic hypothesis. This modification essentially states that although a single phase trajectory cannot pass through every point in phase space, it may come arbitrarily close to any given point in a finite time. Already one can sense confusion of terminology beginning to mount. Ogorodnikov\(^{10}\) uses the term quasi-ergodic to apply to systems covered by the Lewis theorem, which we shall mention later. At this point in time the mathematical interest in ergodic theory began to rise rapidly, and over the next several years it attracted some of the most famous mathematical minds of the 20th century. Farquhar\(^{11}\) points out that several noted physicists stated without justification that all physical systems were quasi-ergodic. The stakes were high and were getting higher with the development of statistical mechanics and the emergence of quantum mechanics as powerful physical disciplines. The identity of phase and time averages became crucial to the comparison of theory with observation.
Mathematicians largely took over the field, developing the formidable literature currently known as ergodic theory, and they became more concerned with showing the existence of the time averages than with their equality with phase averages. Physicists, impatient with mathematicians for being unable to prove what appears 'reasonable', and also what is necessary, began to require the identity of phase and time averages as being axiomatic. This is a position not without precedent and a certain pragmatic justification of expediency. Some essentially adopted the attitude that since thermodynamics "works", phase and time averages must be equal. However, as Farquhar observed, "such a pragmatic view reduces statistical mechanics to an ad hoc technique unrelated to the rest of physical theory."\(^{12}\)
Over the last half century, there have been many attempts to prove the quasi-ergodic hypothesis. Perhaps the most notable results are Birkhoff's theorem\(^{13}\) and the generalization of a corollary known as Lewis' theorem.\(^{14}\) These theorems show the existence of time averages and their equivalence to phase averages under quite general conditions. The tendency in recent years has been to bypass the phase-space-filling properties of a dynamical system and go directly to establishing the equality of phase and time averages. The most recent attempt, due to Sinai\(^{15}\) and recounted by Arnold and Avez\(^{16}\), proves that the Boltzmann-Gibbs conjecture is correct. That is, a "gas" made up of perfectly elastic spheres confined by a container with perfectly reflecting walls is ergodic in the sense that phase and time averages are equal.
At this point the reader is probably wondering what all this has to do with the virial theorem. Specifically, the virial theorem is obtained by taking the time average of Lagrange's identity. Thus
\[\lim_{\mathrm{T} \rightarrow \infty} \frac{1}{2 \mathrm{T}} \int_{\mathrm{t}_0}^{\mathrm{t}_0+\mathrm{T}}\left(\frac{\mathrm{d}^2 \mathrm{I}}{\mathrm{dt}^2}\right) \mathrm{dt}=\langle 2 \mathrm{T}\rangle_{\mathrm{t}}-\langle \mathcal{U}\rangle_{\mathrm{t}},\label{1.5.2}\]
and for systems which are stable the left-hand side is zero. The first problem arises with the fact that the time average is over infinite time and thus operationally difficult to carry out.\(^{1.3}\) Farquhar\(^{17}\) points out that the time interval must at least be long compared to the relaxation time for the system and, in the event that the system crossing time is longer than the relaxation time, the integration in Equation \ref{1.5.2} must exceed that time if any statistical validity is to be maintained in the analysis of the system. It is clear that for stars and star-like objects these conditions are met. However, in stellar dynamics and the analysis of stellar systems they generally are not. Indeed, in this case, the astronomer is in the enviable position of being in the reverse situation from the thermodynamicist. For all intents and purposes he can perform an 'instantaneous' ensemble average which he wishes to equate to a 'theoretically determined' time average. This interpretation will only be correct if the system is ergodic in the sense of satisfying the 'quasi-ergodic hypothesis'. Pragmatically, if the system exhibits a large number of degrees of freedom, then persuasive arguments can be made that the equating of time and phase averages is justified. However, if isolating integrals of the motion exist for the system, then it is not justified, as these integrals remove large regions of phase space from the allowable space of the system trajectory. Lewis' theorem allows for ergodicity in a sub-space, but then the phase averages must be calculated differently, and their correspondence to the observed ensemble average is not clear. Thus, the application of the virial theorem to a system with only a few members, and hence a few degrees of freedom, is invalid unless care is taken to interpret the observed ensemble averages in light of phase averages altered by the isolating integrals of the motion.
Furthermore, one should be most circumspect about applying the virial theorem to large systems like the galaxy which appear to exhibit quasi-isolating integrals of the motion, that is, integrals which appear to restrict the system motion in phase space over several relaxation times. However, for stars and star-like objects exhibiting \(10^{50}\) or more particles undergoing rapid collisions and having short relaxation times, these concerns do not apply, and we may confidently interchange time and phase averages as they appear in the virial theorem. At least we may do it with the same confidence as the thermodynamicist. For those who feel that the ergodic theorem is still "much ado about nothing", it is worth observing that by attempting to provide a rational development between dynamics and thermodynamics, ergodic theory must address itself to the problems of irreversible processes. Since classical dynamics is fully reversible and thermodynamics includes processes which are not, the nature of irreversibility must be connected in some sense to that of ergodicity and thus to the very nature of time itself. Thus, anyone truly interested in the foundations of physics cannot dismiss ergodic theory as mere mathematical 'nit-picking'.
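The remark that the left-hand side of Equation \ref{1.5.2} vanishes for stable systems can be spelled out in one step. The sketch below reads the prefactor as \(1/(2\mathrm{T})\), as a time average requires, and takes "stable" to mean that \(\mathrm{dI}/\mathrm{dt}\) remains bounded for all time, say by a constant \(M\); both the reading of "stable" and the symbol \(M\) are assumptions introduced here. Note that \(\mathrm{T}\) in the prefactor is the averaging interval, not the kinetic energy.

\[\lim_{\mathrm{T} \rightarrow \infty} \frac{1}{2 \mathrm{T}} \int_{\mathrm{t}_0}^{\mathrm{t}_0+\mathrm{T}} \frac{\mathrm{d}^2 \mathrm{I}}{\mathrm{dt}^2} \, \mathrm{dt}=\lim_{\mathrm{T} \rightarrow \infty} \frac{1}{2 \mathrm{T}}\left[\frac{\mathrm{dI}}{\mathrm{dt}}\right]_{\mathrm{t}_0}^{\mathrm{t}_0+\mathrm{T}}, \qquad\left|\frac{1}{2 \mathrm{T}}\left[\frac{\mathrm{dI}}{\mathrm{dt}}\right]_{\mathrm{t}_0}^{\mathrm{t}_0+\mathrm{T}}\right| \leq \frac{M}{\mathrm{T}} \rightarrow 0 .\]

For any finite averaging interval the residual is of order \(M/\mathrm{T}\), which is one way to see why the interval must be long compared to the characteristic times of the system.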
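As a small numerical illustration of the time-averaged Lagrange identity discussed in this section, the sketch below integrates a bound Kepler orbit and averages \(2T+\Omega\) over growing time windows. Several things here are assumptions not taken from this section: the gravitational form \(\frac{1}{2}\,\mathrm{d}^2\mathrm{I}/\mathrm{dt}^2 = 2T + \Omega\), with \(\Omega\) the potential energy; units with \(GM=1\) and unit reduced mass; the initial conditions; and the function name `virial_time_average`. The example concerns only the time average. A two-body system has far too few degrees of freedom for the exchange of time and phase averages, so it says nothing about ergodicity.

```python
import math

def virial_time_average(T_total, dt=1e-3, GM=1.0):
    """Time-average 2T + Omega along a bound Kepler orbit (unit reduced mass).

    Returns the average and the boundary term [r.v]/T_total, which should match it.
    """
    x, y = 1.0, 0.0      # hypothetical start at apocentre
    vx, vy = 0.0, 0.8    # below circular speed, so the orbit is a bound ellipse
    n_steps = int(round(T_total / dt))
    r3 = (x * x + y * y) ** 1.5
    ax, ay = -GM * x / r3, -GM * y / r3
    acc = 0.0
    for _ in range(n_steps):
        # One velocity-Verlet (leapfrog) step.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        r3 = (x * x + y * y) ** 1.5
        ax, ay = -GM * x / r3, -GM * y / r3
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        kinetic = 0.5 * (vx * vx + vy * vy)
        potential = -GM / math.hypot(x, y)
        acc += (2.0 * kinetic + potential) * dt
    elapsed = n_steps * dt
    # (1/2) dI/dt = r.v for unit mass, and r.v = 0 at the starting apocentre.
    boundary = (x * vx + y * vy) / elapsed
    return acc / elapsed, boundary

for T_window in (5.0, 50.0, 200.0):
    avg, boundary = virial_time_average(T_window)
    print(T_window, avg, boundary)
```

Because \(\mathbf{r}\cdot\mathbf{v}\) stays bounded on a closed orbit, the averaged quantity should shrink roughly in proportion to the inverse of the window length, while remaining nonzero for any finite window.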


