9.7: Notations for Tensors
Learning Objectives
- Review notations
- Introduce new notations
Johnny is an American grade-school kid who has had his tender mind protected from certain historical realities, such as the political status of slaves, women, and Native Americans in the early United States. If Johnny ever tries to read the U.S. Constitution, he will be very confused by certain passages, such as the infamous three-fifths clause referring opaquely to “all other persons.”
This optional section is meant to expose you to some similar historical ugliness involving tensor notation, knowledge of which may be helpful if you learn general relativity in the future. As in the evolution of the U.S. Constitution and its interpretation, we will find that not all the changes have been improvements. In this section we briefly recapitulate some notations that have already been introduced, and also introduce two new ones.
Concrete index notation
A displacement vector is our prototypical example of a tensor, and the original nineteenth-century approach was to associate this tensor with the changes in the coordinates. Tensors achieve their full importance in differential geometry, where space (or spacetime, in general relativity) may be curved, in the sense defined in section 2.2. In this context, only infinitesimally small displacements qualify as vectors; to see this, imagine displacements on a sphere, which do not commute for the reasons described in section 8.3. On small scales, the sphere’s curvature is not apparent, which is why we need to make our displacements infinitesimal. Thus in this approach, the simplest example of a relativistic tensor occurs if we pick Minkowski coordinates to describe a region of spacetime that is small enough for the curvature to be negligible, and we associate a displacement vector with a \(4\)-tuple of infinitesimal changes in the coordinates:
\[(dt,dx,dy,dz)\]
Until about 1960, this carried the taint of the lack of rigor believed to be associated with Leibniz-style infinitesimal numbers, but this difficulty was resolved and is no longer an argument against the notation. 1
Coordinate-independent notation
A more valid reason for disliking the old-school notation is that, as described in chapter 7, it is desirable to avoid writing every line of mathematics in a notation that explicitly refers to a choice of coordinates. We might therefore prefer, as Penrose began advocating around 1970, to notate this vector in coordinate-independent notation such as “birdtracks” (section 6.1),
\[\rightarrow dx\]
or the synonymous abstract index notation (section ??, p. ??),
\[dx^a\]
where the use of the Latin letter \(a\) means that we’re not referring to any coordinate system, \(a\) doesn’t take on values such as \(1\) or \(2\), and \(dx^a\) refers to the entire object \(\rightarrow dx\), not to some real number or set of real numbers.
Unfortunately for the struggling student of relativity, there are at least two more notations now in use, both of them incompatible in various ways with the ones we’ve encountered so far.
Cartan notation
Our notation involving upper and lower indices is descended from a similar-looking one invented in 1853 by Sylvester. 2 In this system, vectors are thought of as invariant quantities. We write a vector in terms of a basis \(\{e_\mu\}\) as \(x = \sum x^\mu e_\mu\). Since \(x\) is considered invariant, it follows that the components \(x^\mu\) and the basis vectors \(e_\mu\) must transform in opposite ways. For example, if we convert from meters to centimeters, the \(x^\mu\) get a hundred times bigger, which is compensated for by a corresponding shrinking of the basis vectors by \(1/100\).
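The meters-to-centimeters example can be checked numerically. The following is only a minimal sketch: the helper names `rescale` and `assemble` and the sample components are hypothetical, and each basis vector is represented by a list of numbers giving its extent in fixed reference units.

```python
from fractions import Fraction

def rescale(components, basis, k):
    """Change units by a factor k: components scale by k, basis vectors by 1/k."""
    new_components = [c * k for c in components]
    new_basis = [[b / k for b in e] for e in basis]
    return new_components, new_basis

def assemble(components, basis):
    """Form x = sum over mu of x^mu e_mu, with each e_mu given as a list of numbers."""
    dim = len(basis[0])
    return [sum(c * e[i] for c, e in zip(components, basis)) for i in range(dim)]

# Hypothetical two-dimensional displacement, components measured in meters.
components_m = [Fraction(3), Fraction(-2)]
basis_m = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

# Convert from meters to centimeters: a factor of 100.
components_cm, basis_cm = rescale(components_m, basis_m, Fraction(100))

# The invariant object x is the same before and after the change of units.
x_before = assemble(components_m, basis_m)
x_after = assemble(components_cm, basis_cm)
```

Here the components grow by a factor of 100 while each basis vector shrinks by the same factor, so `x_before` and `x_after` agree.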
This notation clashes with normal index notation in certain ways. One gotcha is that we can’t infer the rank of an expression by counting indices. For example, \(x = \sum x^\mu e_\mu\) is notated as if it were a scalar, but this is actually a notation for a vector.
Circa 1930, Élie Cartan augmented this notation with a trick that is perhaps a little too cute for its own good. He noted that the partial differentiation operators \(\partial/\partial x^\mu\) could be used as a basis for a vector space whose structure is the same as the space of ordinary vectors. In the modern context we rewrite the operator \(\partial/\partial x^\mu\) as \(\partial_\mu\) and use the Einstein summation convention, so that in the Cartan notation we express a vector in terms of its components as
\[x = x^\mu \partial _\mu\]
In the Cartan notation, the symbol \(dx^\mu\) is hijacked in order to represent something completely different from what it normally does; it’s taken to mean the dual vector corresponding to \(\partial_\mu\). The set \(\{dx^\mu\}\) is used as a basis for notating covectors.
A further problem with the Cartan notation arises when we try to use it for dimensional analysis (see below).
Index-free notation
Independently of Penrose and the physics community, mathematicians invented a different coordinate-free notation, one without indices. In this notation, for example, we would notate the squared magnitude of a vector not as \(v_a v^a\) or \(g_{ab} v^a v^b\) but as
\[g(v,v)\]
This notation is too clumsy for use in complicated expressions involving tensors with many indices. As shown in the next section, it is also not compatible with the way physicists are accustomed to doing dimensional analysis.
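To make the comparison between \(g(v,v)\) and \(g_{ab} v^a v^b\) concrete, here is a small sketch that evaluates both in one particular basis. It assumes a Minkowski metric with signature \((+,-,-,-)\) and units with \(c=1\); the function names `g`, `lower`, and `contract` and the sample vector are hypothetical choices made for illustration.

```python
# Assumed metric components: Minkowski, signature (+,-,-,-), units with c = 1.
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def g(u, v):
    """Index-free style: the metric as a bilinear function of two vectors."""
    return sum(ETA[a][b] * u[a] * v[b] for a in range(4) for b in range(4))

def lower(v):
    """Index style: lower an index, v_a = g_ab v^b."""
    return [sum(ETA[a][b] * v[b] for b in range(4)) for a in range(4)]

def contract(w, v):
    """Index style: contract a lower-index object with an upper-index one."""
    return sum(w_a * v_a for w_a, v_a in zip(w, v))

# Hypothetical sample vector with components (t, x, y, z).
v = [5, 3, 0, 0]

index_free = g(v, v)                    # g(v, v)
index_style = contract(lower(v), v)     # v_a v^a
```

Both expressions give the same number. The difference is purely notational: the index version is written as a product of factors, while the index-free version is a function call.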
Incompatibility of Cartan and index-free notation with dimensional analysis
In section 9.6 we developed a system of dimensional analysis for use with abstract index notation. Here we discuss the issues that arise when we attempt to mix in other notational systems.
One of the hallmarks of index-free notation is that it uses nonmultiplicative notation for many tensor products that would have been written as multiplication in index notation, e.g., \(g(v,v)\) rather than \(v^a v_a\). This makes the system clumsy to use for dimensional analysis, since we are accustomed to reasoning about units based on the assumption that the units of any term in an equation equal the product of the units of its factors.
In Cartan notation we have the problem that certain notations, such as \(dx^\mu\), are completely redefined. The remainder of this section is devoted to exploring what goes wrong when we attempt to extend the analysis of section 9.6 to include Cartan notation. Let vector \(r\) and covector \(ω\) be duals of each other, and let \(r\) represent a displacement. In Cartan notation, we write these vectors in terms of their components, in some coordinate system, as follows:
\[r = r^\mu \partial_\mu\]
\[ω = ω_\mu dx^\mu\]
Suppose that the coordinates are Minkowski. Reading from left to right and from top to bottom, there are six quantities occurring in these equations: \(r\), \(r^\mu\), \(\partial_\mu\), \(ω\), \(ω_\mu\), and \(dx^\mu\). We attribute to them the units \(L^A, L^B, \ldots, L^F\), respectively. If we follow the rule that multiplicative notation is to imply multiplication of units, then
\[A = B + C\]
and
\[D = E + F\]
For compatibility with the system in section 9.6, equations 9.7.6 and 9.7.7 require
\[A + D = 2σ\]
and
\[D = 2γ + B\]
To avoid a clash between Cartan and concrete index notation in a Minkowski coordinate system, it would appear that we want the following three additional conditions.
\[F = ξ \quad \text{(units of Cartan } dx^\mu \text{ not to clash with units of } dx^\mu \text{)}\]
\[C = -ξ \quad \text{(units of Cartan } \partial_\mu \text{ not to clash with units of the derivative)}\]
\[B = ξ \quad \text{(units of components in Cartan notation not to clash with units of } dx^\mu \text{)}\]
We have \(6\) unknowns and \(7\) constraints, so in general Cartan notation cannot be incorporated into this system without some constraint on the exponents \((σ,γ,ξ)\). In particular, we require \(ξ = 0\), which is not a choice that most physicists prefer.
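The bookkeeping behind this count can be sketched in a few lines of code. This is only an illustration under stated assumptions: each exponent is stored as a hypothetical tuple of coefficients of \((σ,γ,ξ)\), and the final step assumes the relation \(σ = γ + ξ\), which we take to come from the system of section 9.6 but which is not restated in this section.

```python
# A linear combination a*sigma + b*gamma + c*xi is stored as the tuple (a, b, c).
SIGMA, GAMMA, XI = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def add(p, q):
    return tuple(x + y for x, y in zip(p, q))

def scale(k, p):
    return tuple(k * x for x in p)

def sub(p, q):
    return add(p, scale(-1, q))

# The three no-clash conditions fix F, C, and B directly.
F = XI
C = scale(-1, XI)
B = XI

# Multiplicative rule for r = r^mu d_mu: A = B + C.
A = add(B, C)

# Compatibility condition D = 2*gamma + B.
D = add(scale(2, GAMMA), B)

# Multiplicative rule for omega = omega_mu dx^mu: D = E + F, so E = D - F.
E = sub(D, F)

# Six constraints are now used up. The seventh, A + D = 2*sigma, holds only
# if this residual vanishes: residual = A + D - 2*sigma.
residual = sub(add(A, D), scale(2, SIGMA))

def substitute_sigma(p):
    """ASSUMPTION (taken from section 9.6): sigma = gamma + xi."""
    a, b, c = p
    return (0, b + a, c + a)

residual_with_sigma = substitute_sigma(residual)
```

The residual comes out as \(-2σ + 2γ + ξ\), so the seventh constraint is an extra condition on \((σ,γ,ξ)\) rather than something that can be absorbed by choosing \(A\) through \(F\). Under the assumed relation \(σ = γ + ξ\) it reduces to \(-ξ\), which vanishes only for \(ξ = 0\).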