9.1: Mathematical Preliminaries
We will start with a theorem on differential forms which is needed to formulate Carathéodory’s version of the second law.
Before proving Carathéodory’s theorem, we will need the following result.
Let $A = A_i\, dx^i$ denote a differential one-form. If $A \wedge dA = 0$, then, at least locally, one can find an integrating factor for $A$; i.e., there exist functions $\tau$ and $\phi$ such that $A = \tau\, d\phi$.
The proof of this result is most easily done inductively in the dimension of the space. First, we consider the two-dimensional case, so that $i = 1, 2$. In this case, the condition $A \wedge dA = 0$ is vacuous, since every three-form vanishes identically in two dimensions. Write $A = A_1\, dx^1 + A_2\, dx^2$. We make a coordinate transformation to $\lambda$, $\phi$, where
\[
\frac{dx^1}{d\lambda} = -f(x^1, x^2)\, A_2, \qquad \frac{dx^2}{d\lambda} = f(x^1, x^2)\, A_1 \label{9.1.1}
\]
where $f(x^1, x^2)$ is an arbitrary function which can be chosen in any convenient way. These equations show that
\[
A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} = 0 \label{9.1.2}
\]
Equations \ref{9.1.1} define a set of nonintersecting trajectories, $\lambda$ being the parameter along each trajectory. We choose $\phi$ as the coordinate on transverse sections of the flow generated by Equations \ref{9.1.1}. Making the coordinate transformation from $x^1$, $x^2$ to $\lambda$, $\phi$, we can now write the one-form $A$ as
\[
A = \left( A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} \right) d\lambda + \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d\phi = \tau\, d\phi, \qquad \tau = A_i \frac{\partial x^i}{\partial \phi} \label{9.1.3}
\]
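The two-dimensional statement can be checked on a concrete example. The following sketch verifies $A = \tau\, d\phi$ symbolically with SymPy; the one-form $y\,dx + dy$ and the pair $\tau = e^{-x}$, $\phi = y\, e^{x}$ are our own illustrative choices, not from the text.

```python
import sympy as sp

x, y = sp.symbols('x y')

# Hypothetical one-form A = A1 dx + A2 dy with A1 = y, A2 = 1.
# In two dimensions A ^ dA = 0 is automatic, and the theorem
# guarantees A = tau * d(phi) locally.
A1, A2 = y, sp.S.One

# Candidate integrating factor and potential: A = exp(-x) * d(y*exp(x))
tau = sp.exp(-x)
phi = y * sp.exp(x)

# Check A_i = tau * d(phi)/dx^i component by component
assert sp.simplify(tau * sp.diff(phi, x) - A1) == 0
assert sp.simplify(tau * sp.diff(phi, y) - A2) == 0
```

Here $dy + y\,dx$ is not exact, but multiplying by $e^{x}$ makes it the exact differential $d(y e^{x})$, which is precisely the integrating-factor statement.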
This proves the theorem for two dimensions. In three dimensions, we have
\[
A = A_1\, dx^1 + A_2\, dx^2 + A_3\, dx^3 \label{9.1.4}
\]
The strategy is to start by determining $\tau$, $\phi$ for the $A_1$, $A_2$ subsystem. We choose the new coordinates as $\lambda$, $\phi$, $x^3$ and impose the flow equations of the two-dimensional case. Solving these, we will find $x^1$ and $x^2$ as functions of $\lambda$ and $x^3$. The trajectories will also depend on the starting points, which may be taken as points on the transverse section and hence labeled by $\phi$. Thus we get
\[
x^1 = x^1(\lambda, \phi, x^3), \qquad x^2 = x^2(\lambda, \phi, x^3) \label{9.1.5}
\]
The one-form $A$ now becomes
\[
A = \left( A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} \right) d\lambda + \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d\phi + \left( A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right) dx^3 = \tau\, d\phi + \tilde{A}_3\, dx^3, \qquad \tilde{A}_3 = A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \label{9.1.6}
\]
We now impose the condition $A \wedge dA = 0$:
\[
A \wedge dA = \left[ \tilde{A}_3 (\partial_\lambda A_\phi - \partial_\phi A_\lambda) + A_\lambda (\partial_\phi \tilde{A}_3 - \partial_3 A_\phi) + A_\phi (\partial_3 A_\lambda - \partial_\lambda \tilde{A}_3) \right] dx^3 \wedge d\lambda \wedge d\phi = 0 \label{9.1.7}
\]
Since $A_\lambda = 0$ and $A_\phi = \tau$ from the expression for $A$ above, this equation becomes
\[
\tilde{A}_3 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_3}{\partial \lambda} = 0 \label{9.1.8}
\]
Writing $\tilde{A}_3 = \tau h$, this becomes
\[
\tau^2 \frac{\partial h}{\partial \lambda} = 0 \label{9.1.9}
\]
Since $\tau$ is not identically zero for us, we get $\partial h / \partial \lambda = 0$ and, going back to the expression $A = \tau\, d\phi + \tilde{A}_3\, dx^3$, we can write
\[
A = \tau \left[ d\phi + h(\phi, x^3)\, dx^3 \right] \label{9.1.10}
\]
The quantity in the square brackets is a one-form on the two-dimensional space defined by $\phi$, $x^3$. For this we can use the two-dimensional result and write it as $\tilde{\tau}\, d\tilde{\phi}$, so that
\[
A = \tau \left[ d\phi + h(\phi, x^3)\, dx^3 \right] = \tau \tilde{\tau}\, d\tilde{\phi} \equiv T\, d\tilde{\phi}, \qquad T = \tau \tilde{\tau} \label{9.1.11}
\]
This proves the theorem for the three-dimensional case.
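In three dimensions the condition $A \wedge dA = 0$ is the classical Frobenius integrability condition $A \cdot (\nabla \times A) = 0$. The following sketch, with sample one-forms of our own choosing, contrasts a form that admits an integrating factor with one that does not:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
X = (x1, x2, x3)

def frobenius_scalar(A, X):
    """Coefficient of dx1^dx2^dx3 in A ^ dA, i.e. A . (curl A) in 3D."""
    A1, A2, A3 = A
    curl = (sp.diff(A3, X[1]) - sp.diff(A2, X[2]),
            sp.diff(A1, X[2]) - sp.diff(A3, X[0]),
            sp.diff(A2, X[0]) - sp.diff(A1, X[1]))
    return sp.simplify(sum(a * c for a, c in zip(A, curl)))

# A = T dS with T = x1, S = x2*x3: satisfies A ^ dA = 0, as it must.
T, S = x1, x2 * x3
A_exact = tuple(T * sp.diff(S, v) for v in X)
assert frobenius_scalar(A_exact, X) == 0

# The contact form A = dx3 - x2 dx1 violates the condition:
A_contact = (-x2, sp.S.Zero, sp.S.One)
assert frobenius_scalar(A_contact, X) == 1  # nonzero: no integrating factor
```

The second example shows that the hypothesis $A \wedge dA = 0$ cannot be dropped: for $dx^3 - x^2\, dx^1$ no functions $\tau$, $\phi$ with $A = \tau\, d\phi$ exist, even locally.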
The extension to four dimensions follows a similar pattern. The solutions to the two-dimensional flow equations now become
\[
x^1 = x^1(\lambda, \phi, x^3, x^4), \qquad x^2 = x^2(\lambda, \phi, x^3, x^4) \label{9.1.12}
\]
so that we can bring $A$ to the form
\[
A = \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d\phi + \left( A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right) dx^3 + \left( A_4 + A_1 \frac{\partial x^1}{\partial x^4} + A_2 \frac{\partial x^2}{\partial x^4} \right) dx^4 = \tau\, d\phi + \tilde{A}_3\, dx^3 + \tilde{A}_4\, dx^4 \label{9.1.13}
\]
We now turn to imposing the condition $A \wedge dA = 0$. In local coordinates this becomes
\[
A_\alpha (\partial_\mu A_\nu - \partial_\nu A_\mu) + A_\mu (\partial_\nu A_\alpha - \partial_\alpha A_\nu) + A_\nu (\partial_\alpha A_\mu - \partial_\mu A_\alpha) = 0 \label{9.1.14}
\]
There are four independent conditions here, corresponding to $(\alpha, \mu, \nu) = (1, 2, 3)$, $(4, 1, 2)$, $(3, 4, 1)$, $(3, 2, 4)$. Using $A_\lambda = 0$ and $A_\phi = \tau$, these four equations become
\[
\tilde{A}_3 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_3}{\partial \lambda} = 0 \label{9.1.15}
\]
\[
\tilde{A}_4 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_4}{\partial \lambda} = 0 \label{9.1.16}
\]
\[
\tilde{A}_4 \frac{\partial \tilde{A}_3}{\partial \lambda} - \tilde{A}_3 \frac{\partial \tilde{A}_4}{\partial \lambda} = 0 \label{9.1.17}
\]
\[
\tilde{A}_3 \frac{\partial \tilde{A}_4}{\partial \phi} - \tilde{A}_4 \frac{\partial \tilde{A}_3}{\partial \phi} + \tau \frac{\partial \tilde{A}_3}{\partial x^4} - \tilde{A}_3 \frac{\partial \tau}{\partial x^4} + \tilde{A}_4 \frac{\partial \tau}{\partial x^3} - \tau \frac{\partial \tilde{A}_4}{\partial x^3} = 0 \label{9.1.18}
\]
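That the four listed index triples, with $A_\lambda = 0$ and $A_\phi = \tau$, reproduce exactly these four equations can be verified symbolically. A sketch using SymPy (the function names are our own placeholders for $\tau$, $\tilde{A}_3$, $\tilde{A}_4$):

```python
import sympy as sp

# Coordinates in the order (lambda, phi, x3, x4), indexed 0..3.
lam, phi, x3, x4 = sp.symbols('lam phi x3 x4')
X = (lam, phi, x3, x4)
tau = sp.Function('tau')(*X)
A3t = sp.Function('A3t')(*X)   # tilde A_3
A4t = sp.Function('A4t')(*X)   # tilde A_4
A = (sp.S.Zero, tau, A3t, A4t)  # A_lambda = 0, A_phi = tau

def C(a, m, n):
    """A_a(d_m A_n - d_n A_m) + A_m(d_n A_a - d_a A_n) + A_n(d_a A_m - d_m A_a)."""
    d = lambda i, j: sp.diff(A[j], X[i])
    return (A[a] * (d(m, n) - d(n, m)) + A[m] * (d(n, a) - d(a, n))
            + A[n] * (d(a, m) - d(m, a)))

eq15 = A3t * sp.diff(tau, lam) - tau * sp.diff(A3t, lam)
eq16 = A4t * sp.diff(tau, lam) - tau * sp.diff(A4t, lam)
eq17 = A4t * sp.diff(A3t, lam) - A3t * sp.diff(A4t, lam)
eq18 = (A3t * sp.diff(A4t, phi) - A4t * sp.diff(A3t, phi)
        + tau * sp.diff(A3t, x4) - A3t * sp.diff(tau, x4)
        + A4t * sp.diff(tau, x3) - tau * sp.diff(A4t, x3))

# Triples as listed in the text, translated to 0-based coordinate indices
assert sp.simplify(C(0, 1, 2) - eq15) == 0   # (1, 2, 3)
assert sp.simplify(C(3, 0, 1) - eq16) == 0   # (4, 1, 2)
assert sp.simplify(C(2, 3, 0) - eq17) == 0   # (3, 4, 1)
assert sp.simplify(C(2, 1, 3) - eq18) == 0   # (3, 2, 4)
```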
Again, we introduce $h$ and $g$ by $\tilde{A}_3 = \tau h$, $\tilde{A}_4 = \tau g$. Then Equations \ref{9.1.15} and \ref{9.1.16} become
\[
\frac{\partial h}{\partial \lambda} = 0, \qquad \frac{\partial g}{\partial \lambda} = 0 \label{9.1.19}
\]
Equation \ref{9.1.17} is then identically satisfied. The last equation, namely \ref{9.1.18}, simplifies to
\[
h \frac{\partial g}{\partial \phi} - g \frac{\partial h}{\partial \phi} + \frac{\partial h}{\partial x^4} - \frac{\partial g}{\partial x^3} = 0 \label{9.1.20}
\]
Using these results, Equation \ref{9.1.13} becomes
\[
A = \tau \left[ d\phi + h\, dx^3 + g\, dx^4 \right] \label{9.1.21}
\]
The quantity in the square brackets is a one-form on the three-dimensional space of $\phi$, $x^3$, $x^4$, and we can use the previous result to find an integrating factor for it. The condition for the existence of an integrating factor for $d\phi + h\, dx^3 + g\, dx^4$ is precisely Equation \ref{9.1.20}. Thus, given Equation \ref{9.1.20}, we can write $d\phi + h\, dx^3 + g\, dx^4$ as $t\, ds$ for some functions $t$ and $s$, so that finally $A$ takes the form $A = \tau t\, ds \equiv T\, dS$. Thus the theorem is proved for four dimensions. The procedure can be extended to higher dimensions recursively, establishing the theorem for all dimensions.
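The claim that Equation \ref{9.1.20} is exactly the three-dimensional integrability condition for $d\phi + h\, dx^3 + g\, dx^4$ can be checked directly: the coefficient of $d\phi \wedge dx^3 \wedge dx^4$ in $\omega \wedge d\omega$ reproduces the left-hand side of \ref{9.1.20} up to an overall sign. A SymPy sketch of this check (our own verification, not part of the original proof):

```python
import sympy as sp

phi, x3, x4 = sp.symbols('phi x3 x4')
X = (phi, x3, x4)
h = sp.Function('h')(*X)
g = sp.Function('g')(*X)

# omega = d(phi) + h dx3 + g dx4; components in coordinates (phi, x3, x4)
w = (sp.S.One, h, g)

# Coefficient of dphi ^ dx3 ^ dx4 in omega ^ d(omega), i.e. w . (curl w)
curl = (sp.diff(w[2], X[1]) - sp.diff(w[1], X[2]),
        sp.diff(w[0], X[2]) - sp.diff(w[2], X[0]),
        sp.diff(w[1], X[0]) - sp.diff(w[0], X[1]))
frob = sp.expand(sum(a * c for a, c in zip(w, curl)))

# Left-hand side of Equation (9.1.20)
eq20 = (h * sp.diff(g, phi) - g * sp.diff(h, phi)
        + sp.diff(h, x4) - sp.diff(g, x3))

# frob = -eq20, so omega ^ d(omega) = 0 is exactly (9.1.20)
assert sp.simplify(frob + eq20) == 0
```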
Now we turn to the basic theorem needed for the Carathéodory formulation. Consider an $n$-dimensional manifold $M$ with a one-form $A$ on it. A solution curve to $A$ is defined by $A = 0$ along the curve. Explicitly, the curve may be taken as given by a set of functions $x^i = \xi^i(t)$, where $t$ is the parameter along the curve and
\[
A_i \frac{dx^i}{dt} = A_i \dot{\xi}^i = 0 \label{9.1.22}
\]
In other words, the tangent vector to the curve is orthogonal to $A_i$. The curve therefore lies on an $(n-1)$-dimensional surface. Two points $P$ and $P'$ on $M$ are said to be $A$-accessible if there is a solution curve which contains $P$ and $P'$. Carathéodory's theorem is the following:
If in the neighborhood of a point $P$ there are $A$-inaccessible points, then $A$ admits an integrating factor; i.e., $A = T\, dS$, where $T$ and $S$ are well-defined functions in the neighborhood.
The proof of the theorem involves a reductio ad absurdum argument which constructs paths connecting $P$ to any other point in the neighborhood. (This proof is due to H.A. Buchdahl, Proc. Camb. Phil. Soc. 76, 529 (1979).) For this, define
\[
C_{ijk} = A_i (\partial_j A_k - \partial_k A_j) + A_k (\partial_i A_j - \partial_j A_i) + A_j (\partial_k A_i - \partial_i A_k) \label{9.1.23}
\]
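Note that $C_{ijk}$ is totally antisymmetric in its indices and is, up to a constant factor, the component form of $A \wedge dA$; in three dimensions $C_{123}$ is just $A \cdot (\nabla \times A)$. A SymPy sketch verifying both statements (our own check, not part of the proof):

```python
import sympy as sp

x = sp.symbols('x1:4')
A = [sp.Function(f'A{i}')(*x) for i in (1, 2, 3)]

def C(i, j, k):
    """C_ijk of Equation (9.1.23), with 0-based indices."""
    d = lambda a, b: sp.diff(A[b], x[a])
    return (A[i] * (d(j, k) - d(k, j)) + A[k] * (d(i, j) - d(j, i))
            + A[j] * (d(k, i) - d(i, k)))

# Total antisymmetry under adjacent transpositions
assert sp.simplify(C(0, 1, 2) + C(1, 0, 2)) == 0
assert sp.simplify(C(0, 1, 2) + C(0, 2, 1)) == 0

# In three dimensions C_123 equals A . (curl A)
curlA = (sp.diff(A[2], x[1]) - sp.diff(A[1], x[2]),
         sp.diff(A[0], x[2]) - sp.diff(A[2], x[0]),
         sp.diff(A[1], x[0]) - sp.diff(A[0], x[1]))
assert sp.simplify(C(0, 1, 2) - sum(a * c for a, c in zip(A, curlA))) == 0
```

So the condition $C_{ijk} = 0$ appearing at the end of the proof is the same as the hypothesis $A \wedge dA = 0$ of the first theorem.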
Now consider a point $P'$ near $P$. We have a displacement vector $\epsilon \eta^i$ for the coordinates of $P'$ (from $P$). In general, $\eta^i$ can have a component along $A_i$ and components orthogonal to $A_i$. The idea is to solve for these from the equation $A = 0$. Let $\xi^i(t)$ be a path which begins and ends at $P$, i.e., $\xi^i(0) = \xi^i(1) = 0$, $0 \le t \le 1$, and which is orthogonal to $A_i$; thus it is a solution curve. Any closed curve starting at $P$ and lying in the $(n-1)$-dimensional space orthogonal to $A_i$ can be chosen. Consider now a nearby path given by $x^i(t) = \xi^i(t) + \epsilon \eta^i(t)$. This will also be a solution curve if $A_i(\xi + \epsilon \eta)\, (\dot{\xi} + \epsilon \dot{\eta})^i = 0$. Expanding to first order in $\epsilon$, this is equivalent to
\[
A_i \dot{\eta}^i + \dot{\xi}^i \left( \frac{\partial A_i}{\partial x^j} \right) \eta^j = 0 \label{9.1.24}
\]
where we have also used $A_i \dot{\xi}^i = 0$. We may choose $\dot{\xi}^i$ to be of the form $\dot{\xi}^i = f^{ij} A_j$, where $f^{ij}$ is antisymmetric, to be consistent with $A_i \dot{\xi}^i = 0$. We can find quantities $f^{ij}$ such that this is true; in any case, it is sufficient to exhibit one path which makes $P'$ accessible, so we may consider $\dot{\xi}^i$'s of this form. Thus Equation \ref{9.1.24} becomes
\[
A_i \dot{\eta}^i + \eta^j (\partial_j A_i) f^{ik} A_k = 0 \label{9.1.25}
\]
This is one equation for the $n$ components of the displacement $\eta^i$. We can choose the $n - 1$ components of $\eta^i$ orthogonal to $A_i$ as we like and view this equation as determining the remaining component, the one along $A_i$. So we rewrite it as an equation for $A_i \eta^i$ as follows.
\begin{equation}
\begin{split}
\frac{d}{dt} (A_i \eta^i) & = \dot{A}_i \eta^i + A_i \dot{\eta}^i \\[0.125in]
& = (\partial_j A_i) \dot{\xi}^j \eta^i - \eta^j (\partial_j A_i) f^{ik} A_k \\[0.125in]
& = -\eta^i f^{jk} (\partial_i A_j - \partial_j A_i) A_k \\[0.125in]
& = -\frac{1}{2} \eta^i f^{jk} \left[ A_k (\partial_i A_j - \partial_j A_i) + A_j (\partial_k A_i - \partial_i A_k) + A_i (\partial_j A_k - \partial_k A_j) \right] + \frac{1}{2} (A \cdot \eta) f^{jk} (\partial_j A_k - \partial_k A_j) \\[0.125in]
& = -\frac{1}{2} \eta^i f^{jk} C_{kij} + \frac{1}{2} (A \cdot \eta) f^{ij} (\partial_i A_j - \partial_j A_i)
\end{split}
\label{9.1.26}
\end{equation}
This can be rewritten as
\[
\frac{d}{dt} (A \cdot \eta) - F\, (A \cdot \eta) = -\frac{1}{2} \left( C_{kij} \eta^i f^{jk} \right) \label{9.1.27}
\]
where $F = \frac{1}{2} f^{ij} (\partial_i A_j - \partial_j A_i)$. The important point is that we can choose $f^{ij}$, along with a coordinate transformation if needed, such that $C_{kij} \eta^i f^{jk}$ has no component along $A_i$; that is, the covector $C_{kij} f^{jk}$ (with free index $i$) annihilates the part of $\eta^i$ along $A^i$. For this, notice that
\[
C_{kij} A^i f^{jk} = \left[ A^2 F_{ij} - A_i A_k F_{kj} + A_j A_k F_{ki} \right] f^{ij} \label{9.1.28}
\]
where $F_{ij} = \partial_i A_j - \partial_j A_i$ and $A^2 = A_i A_i$. There are $\frac{1}{2} n(n-1)$ independent components of $f^{ij}$, for which we have a single equation if we set $C_{kij} A^i f^{jk}$ to zero. We can always find a solution; in fact, there are many solutions. Making this choice, $C_{kij} \eta^i f^{jk}$ involves only the components of $\eta$ orthogonal to $A_i$, so the right-hand side of Equation \ref{9.1.27} does not involve $(A \cdot \eta)$. As mentioned earlier, there is a lot of freedom in how these orthogonal components of $\eta$ are chosen. Once they are chosen, we can integrate Equation \ref{9.1.27} to get $(A \cdot \eta)$, the component along $A_i$. Integrating Equation \ref{9.1.27}, we get
\[
A \cdot \eta(1) = \int_0^1 dt\, \exp\left( \int_t^1 dt'\, F(t') \right) \left( -\frac{1}{2} C_{kij} \eta^i f^{jk} \right) \label{9.1.29}
\]
We have chosen $\eta(0) = 0$. It is important that the right-hand side of Equation \ref{9.1.27} does not involve $(A \cdot \eta)$ for us to be able to integrate like this. We choose all components of $\eta^i$ orthogonal to $A_i$ to be such that
\[
\epsilon \eta^i = \text{coordinates of } P' \text{ orthogonal to } A \label{9.1.30}
\]
We then choose $f^{jk}$, if needed by scaling it, such that $A \cdot \eta(1)$ in Equation \ref{9.1.29} gives $A_i (x_{P'} - x_P)^i$. We have thus shown that we can always access $P'$ along a solution curve. The only case where the argument would fail is when $C_{ijk} = 0$, for then $A \cdot \eta(1)$ as calculated is zero and we have no guarantee of matching the component of the displacement of $P'$ along the direction of $A_i$. Thus, if there are inaccessible points in the neighborhood of $P$, we must have $C_{ijk} = 0$. In this case, by the previous theorem, $A$ admits an integrating factor and we can write $A = T\, dS$ for some functions $T$ and $S$ in the neighborhood of $P$. This completes the proof of the Carathéodory theorem.
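The two algebraic identities used above, the rearrangement behind Equation \ref{9.1.27} and the contraction identity of Equation \ref{9.1.28}, hold for arbitrary antisymmetric $f^{ij}$ and $F_{ij}$ and can be verified symbolically. A SymPy sketch (our own check, with $C_{kij} = A_k F_{ij} + A_j F_{ki} + A_i F_{jk}$):

```python
import sympy as sp

n = 4  # dimension; the identities hold index-by-index for any n
A = sp.symbols(f'a1:{n + 1}')
eta = sp.symbols(f'e1:{n + 1}')
# Generic antisymmetric matrices standing in for F_ij and f^ij
F = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'F{i}{j}') if i < j
              else (-sp.Symbol(f'F{j}{i}') if i > j else 0))
f = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'f{i}{j}') if i < j
              else (-sp.Symbol(f'f{j}{i}') if i > j else 0))
R = range(n)

def C(k, i, j):
    # C_kij = A_k F_ij + A_j F_ki + A_i F_jk
    return A[k] * F[i, j] + A[j] * F[k, i] + A[i] * F[j, k]

A_eta = sum(a * e for a, e in zip(A, eta))

# 1) Rearrangement behind Equation (9.1.27):
#    -eta^i f^jk F_ij A_k = -(1/2) eta^i f^jk C_kij + (1/2)(A.eta) f^jk F_jk
lhs1 = -sum(eta[i] * f[j, k] * F[i, j] * A[k]
            for i in R for j in R for k in R)
rhs1 = (-sp.Rational(1, 2) * sum(eta[i] * f[j, k] * C(k, i, j)
                                 for i in R for j in R for k in R)
        + sp.Rational(1, 2) * A_eta * sum(f[j, k] * F[j, k]
                                          for j in R for k in R))
assert sp.expand(lhs1 - rhs1) == 0

# 2) Contraction identity of Equation (9.1.28):
#    C_kij A^i f^jk = [A^2 F_ij - A_i A_k F_kj + A_j A_k F_ki] f^ij
Asq = sum(a * a for a in A)
lhs2 = sum(C(k, i, j) * A[i] * f[j, k] for i in R for j in R for k in R)
rhs2 = sum((Asq * F[i, j]
            - A[i] * sum(A[k] * F[k, j] for k in R)
            + A[j] * sum(A[k] * F[k, i] for k in R)) * f[i, j]
           for i in R for j in R)
assert sp.expand(lhs2 - rhs2) == 0
```

Both checks are pure index gymnastics with antisymmetry, so passing them for generic symbolic entries confirms the identities for any one-form $A$ and any antisymmetric choice of $f^{ij}$.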