
9.1: Mathematical Preliminaries


    We will start with a theorem on differential forms which is needed to formulate Carathéodory’s version of the second law.

    Before proving Carathéodory’s theorem, we will need the following result.

    Theorem \(\PageIndex{1}\) — Integrating Factor Theorem

    Let \(A = A_idx^i\) denote a differential one-form. If \(A ∧ dA = 0\), then at least locally, one can find an integrating factor for \(A\); i.e., there exist functions \(τ\) and \(φ\) such that \(A = τ\;dφ\).
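    One direction of this statement is immediate: if \(A = τ\,dφ\), then \(dA = dτ ∧ dφ\) and \(A ∧ dA = τ\,dφ ∧ dτ ∧ dφ = 0\). As a sketch (not part of the text's proof), this can be checked symbolically in three dimensions with sympy, leaving \(τ\) and \(φ\) arbitrary:

```python
import sympy as sp

x = sp.symbols('x1:4')                      # coordinates x1, x2, x3
tau = sp.Function('tau')(*x)                # arbitrary integrating factor
phi = sp.Function('phi')(*x)                # arbitrary potential
A = [tau * sp.diff(phi, xi) for xi in x]    # components of A = tau d(phi)

def wedge_comp(i, j, k):
    """Component of A ^ dA: the cyclic combination A_i(d_j A_k - d_k A_j) + ..."""
    d = sp.diff
    return (A[i] * (d(A[j], x[k]) - d(A[k], x[j]))
            + A[k] * (d(A[i], x[j]) - d(A[j], x[i]))
            + A[j] * (d(A[k], x[i]) - d(A[i], x[k])))

print(sp.expand(wedge_comp(0, 1, 2)))       # 0: A = tau d(phi) forces A ^ dA = 0
```

The theorem is the nontrivial converse: \(A ∧ dA = 0\) implies, locally, \(A = τ\,dφ\).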

    The proof of this result is most easily done inductively in the dimension of the space. First, we consider the two-dimensional case, so that \(i = 1, 2\). In this case, the condition \(A ∧ dA = 0\) is vacuous, since \(A ∧ dA\) is a three-form and any three-form vanishes identically in two dimensions. Write \(A = A_1dx^1 + A_2dx^2\). We make a coordinate transformation to \(λ\), \(φ\) where

    \[ \begin{equation}
    \begin{split}
    \frac{dx^1}{dλ} & = -f(x^1,x^2)A_2 \\[0.125in]
    \frac{dx^2}{dλ} & = f(x^1,x^2)A_1
    \end{split}
    \end{equation} \label{9.1.1} \]

    where \(f(x^1 , x^2)\) is an arbitrary function which can be chosen in any convenient way. This equation shows that

    \[A_1 \frac{\partial x^1}{\partial λ} + A_2 \frac{\partial x^2}{\partial λ} = 0 \label{9.1.2} \]

    Equations \ref{9.1.1} define a set of nonintersecting trajectories, with \(λ\) the parameter along each trajectory. We choose \(ϕ\) as the coordinate on transverse sections of the flow generated by Equation \ref{9.1.1}. Making the coordinate transformation from \(x^1\), \(x^2\) to \(λ\), \(ϕ\), we can now write the one-form \(A\) as

    \[ \begin{equation}
    \begin{split}
    A & = \left( A_1 \frac{\partial x^1}{\partial λ} + A_2 \frac{\partial x^2}{\partial λ} \right) dλ + \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d \phi \\[0.125in]
    & = τ\;d \phi \\[0.125in] τ & = A_i \frac{\partial x^i}{\partial \phi}
    \end{split}
    \end{equation} \label{9.1.3} \]
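    As a concrete illustration of this construction (a sketch with the hypothetical choice \(A = x^1\,dx^1 + x^2\,dx^2\) and \(f = 1\)): the flow equations give circular trajectories, \(ϕ = (x^1)^2 + (x^2)^2\) labels the transverse sections, and one finds \(A = \frac{1}{2}\,dϕ\), i.e., \(τ = \frac{1}{2}\). A sympy check:

```python
import sympy as sp

lam, phi = sp.symbols('lam phi', positive=True)
# Flow of dx1/dlam = -x2, dx2/dlam = x1 (circles), with phi labeling the circle:
x1 = sp.sqrt(phi) * sp.cos(lam)
x2 = sp.sqrt(phi) * sp.sin(lam)

# Pull A = x1 dx1 + x2 dx2 back to the (lam, phi) coordinates:
A_lam = sp.simplify(x1 * sp.diff(x1, lam) + x2 * sp.diff(x2, lam))
A_phi = sp.simplify(x1 * sp.diff(x1, phi) + x2 * sp.diff(x2, phi))
print(A_lam, A_phi)   # 0 1/2, so A = (1/2) dphi, i.e. tau = 1/2
```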

    This proves the theorem for two dimensions. In three dimensions, we have

    \[A = A_1dx^1 + A_2dx^2 + A_3dx^3 \label{9.1.4} \]

    The strategy is to start by determining \(τ\), \(ϕ\) for the \(A_1\), \(A_2\) subsystem. We choose the new coordinates as \(λ\), \(ϕ\), \(x^3\) and impose Equation \ref{9.1.1}. Solving these, we find \(x^1\) and \(x^2\) as functions of \(λ\) and \(x^3\). The trajectories also depend on the starting points, which may be taken as points on the transverse section and hence labeled by \(ϕ\). Thus we get

    \[x^1 = x^1 (λ, ϕ, x^3 ),\;\;\;\;\;\;\; x^2 = x^2 (λ, ϕ, x^3) \label{9.1.5} \]

    The one-form \(A\) in Equation \ref{9.1.4} now becomes

    \[ \begin{equation}
    \begin{split}
    A & = \left( A_1 \frac{\partial x^1}{\partial λ} + A_2 \frac{\partial x^2}{\partial λ} \right) dλ + \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d \phi + A_3dx^3 + \left( A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right) dx^3 \\[0.125in]
    & = τ\;d \phi + \widetilde{A}_3dx^3 \\[0.125in] \widetilde{A}_3 & = A_3 + \left( A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right)
    \end{split}
    \end{equation} \label{9.1.6} \]

    We now impose the condition \(A ∧ dA = 0\),

    \[ \begin{equation}
    \begin{split}
    A ∧ dA & = \left[ \widetilde{A}_3(∂_λA_ϕ − ∂_ϕA_λ) + A_λ(∂_ϕ \widetilde{A}_3 − ∂_3A_ϕ) + A_ϕ(∂_3A_λ − ∂_λ \widetilde{A}_3) \right] dx^3 ∧ dλ ∧ dϕ \\[0.125in]
    & = 0
    \end{split}
    \end{equation} \label{9.1.7} \]

    Since \(A_λ = 0\) and \(A_ϕ = τ\) from Equation \ref{9.1.6}, this equation becomes

    \[ \widetilde{A}_3 \frac{\partial τ}{\partial λ} - τ \frac{\partial \widetilde{A}_3}{\partial λ} = 0 \label{9.1.8} \]

    Writing \(\widetilde{A}_3 = τ\;h\), this becomes

    \[τ^2 \frac{\partial h}{\partial λ} = 0 \label{9.1.9} \]

    Since \(τ\) is not identically zero for us, we get \(\frac{∂h}{∂λ} = 0\) and, going back to Equation \ref{9.1.6}, we can write

    \[ A = τ \left[ dϕ + h(ϕ, x^3) dx^3 \right] \label{9.1.10} \]

    The quantity in the square brackets is a one-form on the two-dimensional space defined by \(ϕ\), \(x^3\). For this we can use the two-dimensional result and write it as \(\widetilde{τ} d \widetilde{ϕ}\), so that

    \[A = τ \left[ dϕ + h(ϕ, x^3) dx^3 \right] = τ \widetilde{τ}\; d \widetilde{ϕ} ≡ T\, d\widetilde{ϕ} \label{9.1.11}\]

    where \(T = τ \widetilde{τ}\). This proves the theorem for the three-dimensional case.
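    Conversely, when \(A ∧ dA ≠ 0\) no integrating factor can exist, since \(A = τ\,dφ\) would force \(A ∧ dA = 0\). A standard example (not from the text) is the contact form \(A = dx^3 − x^2\,dx^1\), where \(x^2\) denotes the second coordinate; a sympy check that its cyclic combination is nonzero:

```python
import sympy as sp

X = sp.symbols('x1:4')
A = [-X[1], 0, 1]            # contact form A = -x2 dx1 + dx3

def wedge_comp(i, j, k):
    """Component of A ^ dA for the one-form with components A."""
    d = sp.diff
    return (A[i] * (d(A[j], X[k]) - d(A[k], X[j]))
            + A[k] * (d(A[i], X[j]) - d(A[j], X[i]))
            + A[j] * (d(A[k], X[i]) - d(A[i], X[k])))

print(wedge_comp(0, 1, 2))   # -1: nonzero, so no tau, phi with A = tau d(phi)
```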

    The extension to four dimensions follows a similar pattern. The solutions to Equation \ref{9.1.1} become

    \[ x^1 = x^1 (λ, ϕ, x^3, x^4),\;\;\;\;\;\;\; x^2 = x^2 (λ, ϕ, x^3, x^4) \label{9.1.12} \]

    so that we can bring \(A\) to the form

    \[ \begin{equation}
    \begin{split}
    A & = \left( A_1 \frac{\partial x^1}{\partial ϕ} + A_2 \frac{\partial x^2}{\partial ϕ} \right) dϕ + \left( A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right) d x^3 + \left( A_4 + A_1 \frac{\partial x^1}{\partial x^4} + A_2 \frac{\partial x^2}{\partial x^4} \right) dx^4 \\[0.125in]
    & = τ\;d ϕ + \widetilde{A}_3dx^3 + \widetilde{A}_4dx^4
    \end{split}
    \end{equation} \label{9.1.13} \]

    We now turn to imposing the condition \(A ∧ dA = 0\). In local coordinates this becomes

    \[A_α(∂_µA_{\nu} − ∂_{\nu}A_µ) + A_µ(∂_{\nu}A_α − ∂_αA_{\nu}) + A_{\nu}(∂_αA_µ − ∂_µA_α) = 0 \label{9.1.14}\]

    There are four independent conditions here corresponding to \((α, µ, \nu) = (1, 2, 3), (4, 1, 2), (3, 4, 1), (3, 2, 4)\). Using \(A_λ = 0\) and \(A_ϕ = τ\), these four equations become

    \[\widetilde{A}_3 \frac{∂τ}{∂λ} - τ \frac{∂ \widetilde{A}_3}{∂λ} = 0 \label{9.1.15}\]

    \[\widetilde{A}_4 \frac{∂τ}{∂λ} - τ \frac{∂ \widetilde{A}_4}{∂λ} = 0 \label{9.1.16}\]

    \[\widetilde{A}_4 \frac{∂ \widetilde{A}_3}{∂λ} - \widetilde{A}_3 \frac{∂ \widetilde{A}_4}{∂λ} = 0 \label{9.1.17}\]

    \[\widetilde{A}_3 \frac{∂ \widetilde{A}_4}{∂ϕ} - \widetilde{A}_4 \frac{∂ \widetilde{A}_3}{∂ϕ} + τ \frac{∂ \widetilde{A}_3}{∂x^4} - \widetilde{A}_3 \frac{∂τ}{∂x^4} + \widetilde{A}_4 \frac{∂τ}{∂x^3} - τ \frac{∂ \widetilde{A}_4}{∂x^3} = 0 \label{9.1.18}\]

    Again, we introduce \(h\) and \(g\) by \(\widetilde{A}_3 = τ h\), \(\widetilde{A}_4 = τ g\). Then, equations (\ref{9.1.15}) and (\ref{9.1.16}) become

    \[\frac{\partial h}{\partial λ} = 0,\;\;\;\;\;\; \frac{\partial g}{\partial λ} = 0 \label{9.1.19} \]

    Equation \ref{9.1.17} is then identically satisfied. The last equation, namely, \ref{9.1.18}, simplifies to

    \[h \frac{\partial g}{\partial ϕ} - g \frac{\partial h}{\partial ϕ} + \frac{\partial h}{\partial x^4} - \frac{\partial g}{\partial x^3} = 0 \label{9.1.20} \]

    Using these results, Equation \ref{9.1.13} becomes

    \[A = τ \left[ dϕ + hdx^3 + gdx^4 \right] \label{9.1.21}\]

    The quantity in the square brackets is a one-form on the three-dimensional space of \(ϕ, x^3, x^4\), and we can apply the previous result to it. The condition for the existence of an integrating factor for \(dϕ + hdx^3 + gdx^4\) is precisely Equation \ref{9.1.20}. Thus, given Equation \ref{9.1.20}, we can write \(dϕ + hdx^3 + gdx^4 = t\,ds\) for some functions \(t\) and \(s\), so that finally \(A\) takes the form \(A = T\,dS\) with \(T = τ t\) and \(S = s\). This proves the theorem in four dimensions. The procedure can be extended to higher dimensions recursively, establishing the theorem for all dimensions.

    Now we turn to the basic theorem needed for the Carathéodory formulation. Consider an \(n\)-dimensional manifold \(M\) with a one-form \(A\) on it. A solution curve to \(A\) is defined by \(A = 0\) along the curve. Explicitly, the curve may be taken as given by a set of functions \(x^i = ξ^i (t)\), where \(t\) is the parameter along the curve and

    \[A_i \frac{dx^i}{dt} = A_i \dot{ξ}^i = 0 \label{9.1.22} \]

    In other words, the tangent vector to the curve is orthogonal to \(A_i\). The curve therefore lies on an \((n − 1)\)-dimensional surface. Two points \(P\) and \(P'\) on \(M\) are said to be \(A\)-accessible if there is a solution curve which contains \(P\) and \(P'\). Carathéodory’s theorem is the following:
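    For instance, with the hypothetical contact form \(A = dx^3 − x^2\,dx^1\) used earlier, the constraint \(A_i \dot{ξ}^i = 0\) reads \(\dot{ξ}^3 = x^2 \dot{ξ}^1\); one curve satisfying it is \(ξ(t) = (\sin t, \cos t, t/2 + \sin 2t/4)\). A sympy check that this is a solution curve:

```python
import sympy as sp

t = sp.symbols('t')
# A hypothetical curve for the one-form A = -x2 dx1 + dx3:
x1, x2, x3 = sp.sin(t), sp.cos(t), t/2 + sp.sin(2*t)/4

# A_i dxi/dt = -x2 * x1' + x3' must vanish along a solution curve:
residual = -x2 * sp.diff(x1, t) + sp.diff(x3, t)
print(sp.simplify(residual))   # 0
```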

    Theorem \(\PageIndex{2}\) — Carathéodory’s Theorem.

    If in the neighborhood of a point \(P\) there are \(A\)-inaccessible points, then \(A\) admits an integrating factor; i.e., \(A = T dS\) where \(T\) and \(S\) are well defined functions in the neighborhood.

    The proof of the theorem involves a reductio ad absurdum argument which constructs paths connecting \(P\) to any other point in the neighborhood. (This proof is due to H.A. Buchdahl, Proc. Camb. Phil. Soc. 76, 529 (1979).) For this, define

    \[ C_{ijk} = A_i(∂_jA_k − ∂_kA_j) + A_k(∂_iA_j − ∂_jA_i) + A_j (∂_kA_i − ∂_iA_k) \label{9.1.23} \]

    Now consider a point \(P'\) near \(P\). The displacement of \(P'\) from \(P\) is given in coordinates by a vector \(\epsilon η^i\). In general, \(η^i\) can have a component along \(A_i\) and components orthogonal to \(A_i\). The idea is to solve for these from the equation \(A = 0\). Let \(ξ^i (t)\), \(0 ≤ t ≤ 1\), be a path which begins and ends at \(P\), i.e., \(ξ^i (0) = ξ^i (1) = 0\), and which is orthogonal to \(A_i\); thus it is a solution curve. Any closed curve starting at \(P\) and lying in the \((n − 1)\)-dimensional space orthogonal to \(A_i\) can be chosen. Consider now a nearby path given by \(x^i (t) = ξ^i (t) + \epsilon η^i (t)\). This will also be a solution curve if \(A_i(ξ + \epsilon η)\,(\dot{ξ} + \epsilon \dot{η})^i = 0\). Expanding to first order in \(\epsilon\), this is equivalent to

    \[A_i \dot{η}^i + \dot{ξ}^i \left( \frac{\partial A_i}{\partial x^j} \right) η^j = 0 \label{9.1.24}\]

    where we also used \(A_i \dot{ξ}^i = 0\). We may choose \(\dot{ξ}^i\) to be of the form \(\dot{ξ}^i = f^{ij}A_j\) where \(f^{ij}\) is antisymmetric, to be consistent with \(A_i \dot{ξ}^i = 0\). We can find quantities \(f^{ij}\) such that this is true; in any case, it is sufficient to show one path which makes \(P'\) accessible. So we may consider \(\dot{ξ}^i\)’s of this form. Thus Equation \ref{9.1.24} becomes

    \[ A_i \dot{η}^i + η^j (∂_jA_i)f^{ik}A_k = 0 \label{9.1.25} \]

    This is one equation for the \(n\) components of the displacement \(η^i\). We can choose the \(n − 1\) components of \(η^i\) which are orthogonal to \(A_i\) as we like and view this equation as determining the remaining component, the one along \(A_i\). So we rewrite this equation as an equation for \(A_iη^i\) as follows.

    \[ \begin{equation}
    \begin{split}
    \frac{d}{dt} (A_iη^i) & = \dot{A}_iη^i + A_i \dot{η}^i \\[0.125in]
    & = (∂_jA_i) \dot{ξ}^j η^i − η^j (∂_jA_i)f^{ik}A_k \\[0.125in]
    & = −η^i f^{jk}(∂_iA_j − ∂_jA_i)A_k \\[0.125in]
    & = -\frac{1}{2} η^i f^{jk} [A_k(∂_iA_j − ∂_jA_i) + A_j (∂_kA_i − ∂_iA_k) + A_i(∂_jA_k − ∂_kA_j)] + \frac{1}{2} (A · η)f^{jk}(∂_jA_k − ∂_kA_j ) \\[0.125in]
    & = -\frac{1}{2} η^i f^{jk}C_{kij} + \frac{1}{2} (A · η)f^{ij} (∂_iA_j − ∂_jA_i)
    \end{split}
    \end{equation} \label{9.1.26} \]

    This can be rewritten as

    \[\frac{d}{dt} (A · η) - F (A · η) = -\frac{1}{2} (C_{kij}η^i f^{jk}) \label{9.1.27} \]

    where \(F = \frac{1}{2} f^{ij} (∂_iA_j − ∂_jA_i)\). The important point is that we can choose \(f^{ij}\), along with a coordinate transformation if needed, such that the covector \(C_{kij}f^{jk}\) has no component along \(A_i\), so that the right-hand side of Equation \ref{9.1.27} involves only the components of \(η^i\) orthogonal to \(A_i\). For this, notice that

    \[ C_{kij}\,A^i f^{jk} = \left( A^2 F_{jk} + A_k A^i F_{ij} + A_j A^i F_{ki} \right) f^{jk} \label{9.1.28}\]

    where \(F_{ij} = ∂_iA_j − ∂_jA_i\). There are \(\frac{1}{2} n(n − 1)\) independent components in \(f^{ij}\), on which we impose the single equation \(C_{kij}A^i f^{jk} = 0\). We can always find a solution; in fact, there are many solutions. Making this choice, \(C_{kij}f^{jk}\) has no component along \(A_i\), so only the components of \(η\) orthogonal to \(A_i\) appear on the right-hand side of Equation \ref{9.1.27}. As mentioned earlier, there is a lot of freedom in how these components of \(η\) are chosen. Once they are chosen, we can integrate Equation \ref{9.1.27} to get \((A · η)\), the component along \(A_i\). Integrating Equation \ref{9.1.27}, we get

    \[A · η(1) = -\int_0^1 dt\, \exp \left( \int_t^1 dt'\, F(t') \right) \left( \frac{1}{2} C_{kij}η^i f^{jk} \right) \label{9.1.29} \]

    We have chosen \(η(0) = 0\). It is important that the right-hand side of Equation \ref{9.1.27} does not involve \((A · η)\) for us to be able to integrate like this. We choose all components of \(η^i\) orthogonal to \(A_i\) to be such that

    \[ \epsilon η^i = \text{ coordinates of } P' \text{ orthogonal to } A \label{9.1.30} \]

    We then choose \(f^{jk}\), scaling it if needed, such that \(A · η(1)\) in Equation \ref{9.1.29} equals \(A_i(x_{P'} − x_P)^i\). We have thus shown that we can always access \(P'\) along a solution curve. The only case where the argument fails is when \(C_{ijk} = 0\): then \(A · η(1)\) as calculated is zero and we have no way to match the component of the displacement of \(P'\) along the direction of \(A_i\). Thus if there are inaccessible points in the neighborhood of \(P\), we must have \(C_{ijk} = 0\). In that case, by Theorem \(\PageIndex{1}\), \(A\) admits an integrating factor and we can write \(A = T dS\) for some functions \(T\) and \(S\) in the neighborhood of \(P\). This completes the proof of the Carathéodory theorem.
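    The integration step above is the standard integrating-factor solution of a linear first-order equation \(\frac{d}{dt}(A · η) − F\,(A · η) = g(t)\) with \(η(0) = 0\), namely \(A · η(1) = \int_0^1 dt\; e^{\int_t^1 F\,dt'}\, g(t)\). A sympy sketch verifying this solution formula for a hypothetical choice \(F = 2\), \(g = e^t\):

```python
import sympy as sp

t, tp = sp.symbols('t tp')
F = sp.Integer(2)     # hypothetical constant coefficient F(t) = 2
g = sp.exp(t)         # hypothetical source term on the right-hand side

# Candidate: y(1) = integral_0^1 dt exp(integral_t^1 F dt') g(t)
y1 = sp.integrate(sp.exp(sp.integrate(F, (tp, t, 1))) * g, (t, 0, 1))

# Independent check: solve dy/dt - F y = g with y(0) = 0, evaluate at t = 1
y = sp.Function('y')
sol = sp.dsolve(sp.Eq(y(t).diff(t) - F * y(t), g), y(t), ics={y(0): 0})
print(sp.simplify(sol.rhs.subs(t, 1) - y1))   # 0: the two results agree
```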


    This page titled 9.1: Mathematical Preliminaries is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by V. Parameswaran Nair via source content that was edited to the style and standards of the LibreTexts platform.