# 1.2: Vector Algebra

## Representing Ordinary Vectors

Let’s start all the way back at the beginning with a quick review of vector algebra, to see if we can put the significantly more complicated vector algebra associated with quantum mechanics into a familiar context. We can express a vector in a plane in a number of ways. We can write it as the vector itself (a letter with an arrow over it); as a magnitude multiplied by a directional unit vector; as an expansion in a basis of specific unit vectors (shown below as fixed Cartesian unit vectors, but these could also be position-variable unit vectors such as those in polar coordinates); and as a row or column matrix:

\[\overrightarrow v = v\widehat v = v_x \widehat i + v_y \widehat j \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v_x & v_y \end{array} \right] \;\; or \;\; \left[ \begin{array}{*{20}{c}} v_x \\ v_y \end{array} \right] \]

\[where:\;\;\;\;\; \begin{array}{l} \widehat i \;\leftrightarrow \; \left[ \begin{array}{*{20}{c}} 1 & 0 \end{array} \right] \;\; or \;\; \left[ \begin{array}{*{20}{c}} 1 \\ 0 \end{array} \right] \\ \widehat j \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} 0 & 1 \end{array} \right] \;\; or \;\; \left[ \begin{array}{*{20}{c}} 0 \\ 1 \end{array} \right] \end{array} \]

Alert

*We will use the double-arrow rather than the equal sign when associating matrix representations with vectors, because matrices are representations within a specific basis. One could reasonably argue that the abstract unit vectors are as well (e.g. any two orthogonal directions can be defined to be the “\(x\)” and “\(y\)” directions), but matrices are yet another step removed. For example, if one draws a vector on a piece of paper, and draws a set of x-y axes as well, then the \(\widehat i\) and \(\widehat j\) directions are defined. The matrix representation of those directions, however, is not defined by the diagram. Any \(2 \times 1\) (or \(1 \times 2\)) matrix with a unit modulus can represent the unit vector in the \(x\)-direction, and any matrix of the same dimensions that is perpendicular to it and has unit modulus can represent the unit vector in the \(y\)-direction. Nevertheless, the matrix representation has some very nice features that make it more useful than the abstract algebra representation for what is to come.*

While it isn’t clear in this context, the row and column matrices express two different versions of the vector. Typically we define the column matrix as the standard representation of the vector, and the row matrix as the representation of its *adjoint*, denoted with a dagger superscript:

\[\overrightarrow v \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v_x \\ v_y \end{array} \right], \;\;\;\;\; \overrightarrow v^\dagger \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v_x & v_y \end{array} \right] \]

Note that taking an adjoint of an adjoint undoes the original, since the transpose of the transpose of a matrix is the original matrix.

## Inner Products of Ordinary Vectors

These vectors satisfy a scalar (or “inner” or “dot”) product, which consists of multiplying the adjoint of one vector by the other vector:

\[ \overrightarrow v \cdot \overrightarrow w \equiv \overrightarrow v^\dagger \overrightarrow w \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v_x & v_y \end{array} \right] \left[ \begin{array}{*{20}{c}} w_x \\ w_y \end{array} \right] = v_x w_x + v_y w_y \]

As a general rule, the inner product of a vector with itself gives the square of its magnitude, which is a positive-definite scalar:

\[ \overrightarrow v \cdot \overrightarrow v \equiv \overrightarrow v^\dagger \overrightarrow v \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v_x & v_y \end{array} \right] \left[ \begin{array}{*{20}{c}} v_x \\ v_y \end{array} \right] = v^2_x + v^2_y \]

So far we have been dealing with run-of-the-mill vectors, such as we used in mechanics and E&M. But in anticipation of how we will be using vectors in quantum mechanics, we need to consider an alteration of our definition of the adjoint to maintain our positive-definite square when the components are complex numbers. Specifically, we need to include a complex conjugate with the matrix transpose when taking the adjoint:

\[ \overrightarrow v^\dagger \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v^*_x & v^*_y \end{array} \right] \;\;\; \Rightarrow \;\;\; \overrightarrow v^\dagger \overrightarrow v \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v^*_x & v^*_y \end{array} \right] \left[ \begin{array}{*{20}{c}} v_x \\ v_y \end{array} \right] = v^*_x v_x + v^*_y v_y = \left|v_x\right|^2 + \left|v_y\right|^2 \]

It's still true that two adjoints of the same vector cancel each other: not only does the transpose of a transpose leave a matrix unchanged, but conjugating a complex number twice returns its original value. Note that in light of this change, the inner product given by the equation above needs to be revised:

\[ \overrightarrow v \cdot \overrightarrow w \equiv \overrightarrow v^\dagger \overrightarrow w \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} v^*_x & v^*_y \end{array} \right] \left[ \begin{array}{*{20}{c}} w_x \\ w_y \end{array} \right] = v^*_x w_x + v^*_y w_y \]
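This behavior of the adjoint with complex components is easy to check numerically. Below is a minimal sketch using NumPy, with arbitrary illustrative component values (not taken from the text); `np.vdot` conjugates its first argument, which is exactly the adjoint operation described above:

```python
import numpy as np

# Arbitrary illustrative 2-component vectors with complex entries
v = np.array([1 + 2j, 3 - 1j])
w = np.array([2 - 1j, 1 + 1j])

# np.vdot conjugates its first argument, so it computes v-dagger times w
inner_vw = np.vdot(v, w)   # v_x* w_x + v_y* w_y = (2-1j)

# The inner product of a vector with itself is |v_x|^2 + |v_y|^2:
# real and positive-definite, even with complex components
mag_sq = np.vdot(v, v)
print(inner_vw)   # (2-1j)
print(mag_sq)     # (15+0j)  ->  |1+2j|^2 + |3-1j|^2 = 5 + 10
```

Note that without the conjugation, the "square" of a complex vector could come out complex or negative, which is why the adjoint must include it.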

We can also use the inner product to “project-out” a component of a vector using a scalar product with the unit vector in the direction of that component:

\[ v_x = \widehat i \cdot \overrightarrow v = \widehat i^\dagger \, \overrightarrow v \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} 1 & 0 \end{array} \right] \left[ \begin{array}{*{20}{c}} v_x \\ v_y \end{array} \right] = v_x \]

## Resolution of the Identity with Ordinary Vectors

On the vector space there acts an *identity operator*, which is *self-adjoint*, and acts like a scalar “1” when it acts upon any vector or its adjoint:

\[ I=I^\dagger \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} 1 & 0 \\ 0 & 1 \end{array} \right] \;\;\; \Rightarrow \;\;\; \overrightarrow v^\dagger I = \overrightarrow v^\dagger, \;\;\; I \overrightarrow v = \overrightarrow v \]

A very useful trick that we will employ over and over is called *resolution of the identity*, which consists of multiplying all of the unit vectors in the space by their adjoints (the former on the left and the latter on the right – the opposite of a scalar product), and summing:

\[ I = \widehat i \, \widehat i^\dagger + \widehat j \, \widehat j^\dagger \; \leftrightarrow \; \left[ \begin{array}{*{20}{c}} 1 \\ 0 \end{array} \right]\left[ \begin{array}{*{20}{c}} 1 & 0 \end{array} \right] + \left[ \begin{array}{*{20}{c}} 0 \\ 1 \end{array} \right]\left[ \begin{array}{*{20}{c}} 0 & 1 \end{array} \right] = \left[ \begin{array}{*{20}{c}} 1 & 0 \\ 0 & 1 \end{array} \right] \]
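In the matrix representation this sum of unit vectors times their adjoints is easy to verify directly. A minimal NumPy sketch using the two-dimensional Cartesian basis (the test vector's components are illustrative):

```python
import numpy as np

# Column-matrix representations of the unit vectors i-hat and j-hat
i_hat = np.array([[1.0], [0.0]])
j_hat = np.array([[0.0], [1.0]])

# Resolution of the identity: each unit vector multiplied by its adjoint
# (column on the left, row on the right), then summed
I = i_hat @ i_hat.T + j_hat @ j_hat.T
print(I)   # the 2x2 identity matrix

# Acting on any vector (or its adjoint) changes nothing
v = np.array([[2.0], [3.0]])
assert np.array_equal(I @ v, v)
assert np.array_equal(v.T @ I, v.T)
```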

## Vectors in Hilbert Space

All of the discussion above involved vectors in a plane, but of course it all applies equally to vectors in our 3-dimensional space as well. While it is hard to visualize in our physical universe, there is no reason (mathematically) to stop there – all the same rules and properties will apply to 4 or 5 dimensions, or for that matter, to an infinite number. We can imagine associating a unit vector with every integer on the number line. If we try using symbols like \(\widehat i\) and \(\widehat j\) for these unit vectors, we quickly run out of letters. An alternate notation involves the use of subscripts, which can be extended indefinitely:

\[ \widehat i = \widehat e_1, \;\; \widehat j = \widehat e_2, \;\; \dots \widehat e_n \;\; \dots\]

If we wish to represent these unit vectors with column matrices, once again the simplest basis is the one where there are 1's in a single entry, and 0's in all the others:

\[ \widehat e_1 \: \leftrightarrow \; \left[ \begin{array}{*{20}{c}} 1 \\ 0 \\ \vdots \\ \vdots \end{array} \right], \;\;\; \widehat e_2 \: \leftrightarrow \; \left[ \begin{array}{*{20}{c}} 0 \\ 1 \\ \vdots \\ \vdots \end{array} \right], \;\;\; \dots \;\;\; \widehat e_n \: \leftrightarrow \; \left[ \begin{array}{*{20}{c}} 0 \\ \vdots \\ 1 \\ \vdots \end{array} \right], \;\;\; \dots\]

A brilliant mathematician named David Hilbert didn't stop here. He extended the idea not only to every integer on a number line, but to *every point on a continuous line*. Clearly the indexed notation of \(\widehat e_n\) won't work for this case, nor will we be able to represent the unit vectors as matrices with discrete entries – a new notation is needed. What is more, we need to accommodate the ideas above of an adjoint, a magnitude, a scalar product, and resolution of the identity.

Instead of describing an arbitrary unit vector using an integer \(n\) from a number line, we include all of the points on the number line between the integers as well, which means we describe the unit vector with a *real number* \(\alpha\) on an axis. Instead of using the "hat" notation we use above for unit vectors, we write the unit vector associated with the position \(\alpha\) as a *ket*:

\[ \alpha^{th}\;unit\;vector:\;\;\left| \;\alpha \;\right> \]

This unit vector doesn't have a matrix representation, because matrices have discrete entries, and this unit vector is defined on a continuum. The adjoint of this unit vector is written as a *bra*:

\[ \left| \;\alpha\; \right>^\dagger = \left<\; \alpha \; \right| \]

Rather than using the "arrow" notation for regular (non-unit) vectors, we also use the ket:

\[ Hilbert\;space\;vector:\;\;\left| \;\Psi \;\right> \]

If we want the component of this vector along the unit vector associated with \(\alpha\), we only need to do what we did in Equation 1.2.8. This component is a scalar function of \(\alpha\), which we designate as follows:

\[ \psi\left(\alpha\right) \equiv \left<\;\alpha\;|\;\Psi\;\right> \]

If we take the adjoint of this component, the bra turns into a ket, and the ket into a bra, but the quantity itself is not a matrix (it is a scalar function), so there is no "structural" change. However, recall that the adjoint also includes complex conjugation, and this can be reflected in the component (which in general can be a complex number), so:

\[ \psi^\dagger\left(\alpha\right)= \left<\;\alpha\;|\;\Psi\;\right>^\dagger = \left<\;\Psi\;|\;\alpha\;\right> = \psi^*\left(\alpha\right) \]

## Inner Products of Hilbert Space Vectors

All of this follows very cleanly from what we found with ordinary vectors, but we run into a complication if we wish to take it any further. Suppose we want to take the scalar product of two Hilbert space vectors, and express it in terms of the components:

\[ \left| \;\Psi \;\right> \cdot \left| \;\Phi \;\right> = \left| \;\Psi \;\right>^\dagger \left| \;\Phi \;\right> = \left< \;\Psi \;| \;\Phi \;\right> = \;?\]

We know that *one component* of the vector \(\left|\;\Phi \;\right>\) is \(\phi\left(\alpha\right)\) (the component associated with the point \(\alpha\) – there are infinitely-many of these components), and we know that the corresponding component for the vector \(\left|\;\Psi \;\right>\) is \(\psi\left(\alpha\right)\). We can multiply these components together as we did in Equation 1.2.7, but we have infinitely many more multiplications to do for all the other values of \(\alpha\), and then we need to sum them all together to get the inner product!

We have already seen in previous physics applications (mass distributions for moment of inertia and charge distribution for electrostatics) how to deal with continuous distributions like this. When we are summing an infinite number of continuum-distributed values, we do so using a *density* and an integral. For example, for a line of electric charge stretching from \(\alpha=a\) to \(\alpha=b\) with a density function of \(\lambda\left(\alpha\right)\), the total charge is found using:

\[ Q_{tot} = \int \limits_a^b \lambda\left(\alpha\right) d\alpha \]

In the case of Hilbert space vector components, we define the product \(\psi^*\left(\alpha\right)\phi\left(\alpha\right)\) as the density of the inner product at the point \(\alpha\). To get the full inner product, we need only add up (integrate) all of those contributions. If the \(\alpha\)-axis stretches infinitely-far in both directions, we have:

\[ \left| \;\Psi \;\right> \cdot \left| \;\Phi \;\right> = \left< \;\Psi \;| \;\Phi \;\right> = \int \limits_{-\infty}^{+\infty}\psi^*\left(\alpha\right)\phi\left(\alpha\right)d\alpha \]
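This integral can be approximated numerically by discretizing the \(\alpha\)-axis. A sketch using two normalized Gaussian component functions (these particular choices of \(\psi\) and \(\phi\) are illustrative, not from the text); for these Gaussians the inner product works out analytically to \(e^{-1/4} \approx 0.7788\):

```python
import numpy as np

# Discretize the alpha-axis (range and resolution are illustrative choices)
alpha = np.linspace(-10.0, 10.0, 100001)
dalpha = alpha[1] - alpha[0]

# Two normalized Gaussian components psi(alpha) and phi(alpha)
psi = np.pi**-0.25 * np.exp(-alpha**2 / 2)
phi = np.pi**-0.25 * np.exp(-(alpha - 1.0)**2 / 2)

# <Psi|Phi> ~ sum over psi*(alpha) phi(alpha) dalpha; np.conj is a no-op
# here since psi is real, but it belongs in the general formula
inner = np.sum(np.conj(psi) * phi) * dalpha
print(inner)   # ~0.7788, the analytic value exp(-1/4) for these Gaussians
```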

## Resolution of the Identity with Hilbert Space Vectors

What about resolving the identity, as we did in Equation 1.2.10? With an infinite number of unit vectors, we once again have a sum of an infinite number of vector products on a continuum. Following our rule of turning the sum into an integral over the \(\alpha\) interval (which we will continue to assume to be from \(-\infty\) to \(+\infty\), and remembering to put the adjoint unit vectors *second*), we get:

\[ I = \int \limits_{-\infty}^{+\infty}\left| \;\alpha \;\right>\left< \;\alpha \;\right|d\alpha \]

We can directly show that these last two equations are consistent with each other by "sandwiching" the identity between the two Hilbert space vectors that we are taking an inner product of. [*If this doesn't make sense, it helps to go back to the matrix representation of ordinary vectors – we are sandwiching a square matrix between a row matrix on the left and a column matrix on the right.*]

\[\left< \;\Psi \;| \;\Phi \;\right> = \left< \;\Psi \;\right| \; I \; \left| \;\Phi \;\right> = \left< \;\Psi \;\right| \left[\int \limits_{-\infty}^{+\infty}\left| \;\alpha \;\right>\left< \;\alpha \;\right|d\alpha \right] \left| \;\Phi \;\right> = \int \limits_{-\infty}^{+\infty}\left<\;\Psi\; | \;\alpha \;\right>\left< \;\alpha \;|\;\Phi\;\right>d \alpha = \int \limits_{-\infty}^{+\infty}\psi^*\left(\alpha\right)\phi\left(\alpha\right)d\alpha\]

And finally, the magnitude-squared of a Hilbert space vector is the inner product of it with itself:

\[ \left< \;\Psi \;| \;\Psi \;\right> = \int \limits_{-\infty}^{+\infty}\psi^*\left(\alpha\right)\psi\left(\alpha\right)d\alpha = \int \limits_{-\infty}^{+\infty}\left|\psi\left(\alpha\right)\right|^2d\alpha \]

## Delta Functions

We know that unit vectors are mutually orthogonal, and one way to express this is in terms of the *Kronecker delta*, which is defined as follows:

\[\delta\left(i,j\right) \equiv \left\{ \begin{array}{ll} 1 & i=j \\ 0 & i \ne j \end{array} \right. \]

[Note: This is typically expressed more compactly as simply: \(\delta_{ij}\).]

The orthogonality (and the unit magnitude) of the unit vectors can be expressed simply using this device:

\[ \widehat e_i = \sum \limits_{all\;j} \widehat e_j \; \delta\left(i,j\right) \]

Notice that the Kronecker delta acts as a filter – every value of \(j\) in the sum that is not equal to \(i\) gives a zero contribution to the sum, and the value of \(j\) that equals \(i\) results in a value of one for the delta, giving the proper result. We can see this from another angle, by multiplying a unit vector by the identity:

\[ \widehat e_i = I \; \widehat e_i = \left[\sum \limits_{all\;j} \widehat e_j \widehat e_j^\dagger \right] \widehat e_i = \sum \limits_{all\;j} \widehat e_j \left[\widehat e_j^\dagger \; \widehat e_i \right] \]

Comparing these last two equations, we conclude that:

\[ \delta\left(i,j\right) = \widehat e_j^\dagger \; \widehat e_i \]
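This relation, and the "filter" behavior above, are simple to confirm numerically in a finite-dimensional space. A sketch (the dimension is an arbitrary choice):

```python
import numpy as np

n = 4                 # an arbitrary small dimension
basis = np.eye(n)     # row k is the matrix representation of the unit vector e_{k+1}

# delta(i, j) = e_j-dagger e_i: 1 when the indices match, 0 otherwise
for i in range(n):
    for j in range(n):
        assert basis[j] @ basis[i] == (1.0 if i == j else 0.0)

# The Kronecker delta acts as a filter: summing e_j * delta(i, j) over all j
# reproduces e_i, since every term but one vanishes
i = 2
rebuilt = sum(basis[j] * (basis[j] @ basis[i]) for j in range(n))
assert np.array_equal(rebuilt, basis[i])
```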

For Hilbert space vectors, it works much the same way, with two exceptions. First, the comparison is between two real numbers, not two integers. With the standard vectors, we can compare the integers one-by-one until the answer to the question "Are they the same?" is "Yes." With real numbers, it is easier to simply take the difference between the numbers being compared and ask if that difference is zero. Therefore the *Dirac delta* is expressed as a function that looks like: \(\delta\left(\alpha - \alpha '\right)\).

Second, the value of this function is not simply 1 or 0, because the resolution of the identity is an integral rather than a sum:

\[ \left|\;\alpha '\;\right> = I\;\left|\;\alpha '\;\right> = \left[\int \limits_{-\infty}^{+\infty} \left|\;\alpha \;\right> \left<\; \alpha \; \right| d\alpha \right] \left|\;\alpha '\;\right> = \int \limits_{-\infty}^{+\infty} \left|\;\alpha \;\right> \left<\;\alpha \; | \; \alpha '\;\right>d\alpha = \int \limits_{-\infty}^{+\infty} \left|\;\alpha \;\right>\delta\left(\alpha - \alpha '\right) d\alpha \]

This gives us the Hilbert space version of Equation 1.2.27:

\[ \delta\left(\alpha - \alpha '\right) = \left<\;\alpha \;|\;\alpha '\;\right> \]

Notice that if the Dirac delta function were equal to 1 when its argument was zero, then every part of this integral would vanish except for one infinitesimal sliver. But since it *must* equal zero everywhere besides \(\alpha = \alpha '\) for the vectors to be orthogonal, the delta function must contribute an infinite amount at \(\alpha = \alpha '\) to counterbalance the product with the infinitesimal \(d \alpha\) to give the value 1 needed.

So the Dirac delta function is a very strange animal – it is infinitely-tall and infinitesimally-thin, and the product of the height and width is such that the area under this "curve" is 1.

**Figure 1.2.1 – Dirac Delta Function**

While we will refer back to this crazy function fairly frequently in this text from a conceptual standpoint, we will not use it mathematically very often. Nevertheless, here are two of its many mathematical properties, the first of which is pretty much its defining characteristic:

\[ \int \limits_{-\infty}^{+\infty} f\left(\alpha\right)\,\delta\left(\alpha - \alpha '\right) d\alpha = f\left(\alpha '\right) \]

\[ \delta\left(\alpha - \alpha '\right) = \delta\left(\alpha ' - \alpha\right) \]
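Numerically, the Dirac delta can be modeled as a normalized Gaussian whose width shrinks toward zero. The sketch below uses illustrative widths and a cosine test function (neither comes from the text): the area under the spike stays 1, and integrating the spike against a smooth function picks out that function's value at the spike (the "sifting" behavior):

```python
import numpy as np

def delta_approx(alpha, center, eps):
    """Normalized Gaussian of width eps: approaches a Dirac delta as eps -> 0."""
    return np.exp(-(alpha - center)**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

alpha = np.linspace(-10.0, 10.0, 200001)
dalpha = alpha[1] - alpha[0]

# The spike gets taller and thinner, but the area under it stays 1
for eps in (1.0, 0.1, 0.01):
    area = np.sum(delta_approx(alpha, 0.0, eps)) * dalpha
    print(eps, area)   # area ~1.0 in every case

# Sifting: integrating f(alpha) against a delta centered at alpha' = 0.5
# picks out f(0.5)
picked = np.sum(np.cos(alpha) * delta_approx(alpha, 0.5, 0.01)) * dalpha
print(picked, np.cos(0.5))   # nearly equal
```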

Example \(\PageIndex{1}\)

*Use Equation 1.2.29 to prove Equation 1.2.31.*

**Solution**

*The Dirac delta function has no imaginary part, which means that it equals its own complex conjugate:*

\[ \delta\left(\alpha - \alpha '\right) = \left[\delta\left(\alpha - \alpha '\right)\right]^* \nonumber \]

*Now plug Equation 1.2.29 into the right side, and use the fact that the complex conjugate flips the bra-ket around:*

\[ \delta\left(\alpha - \alpha '\right) = \left<\;\alpha\;|\;\alpha '\;\right>^* = \left<\;\alpha '\;|\;\alpha\;\right> = \delta\left(\alpha '- \alpha\right)\nonumber \]

Digression: Charge Densities of Point Charges

*In our study of electricity & magnetism, we encountered both point charges and distributions of charge (expressed as a charge density). The two sources for field can be brought together through the Dirac delta function (in three dimensions), which is essentially proportional to the charge density of a point charge. That is, we can write a point charge \(q\) at a position \(\overrightarrow r'\) in space as a charge density this way:*

*\[ \rho\left(\overrightarrow r\right) = q \; \delta\left(\overrightarrow r - \overrightarrow r'\right) \nonumber\]*

*It makes sense: This density is infinite at the position of the point charge (the finite charge is crammed into zero volume), is zero everywhere else, and a volume integral of this density performed around the point \(\overrightarrow r'\) results in the enclosed charge \(q\). Note, by the way, that the Dirac delta function of a vector is shorthand for a product of three Dirac delta functions:*

*\[ \delta\left(\overrightarrow r - \overrightarrow r'\right) \equiv \delta\left(x- x'\right)\;\delta\left(y- y'\right)\;\delta\left(z- z'\right) \nonumber \]*
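*This factorization can be checked numerically by modeling each one-dimensional delta as a narrow normalized Gaussian; the charge value, spike location, width, and grid below are all illustrative choices:*

```python
import numpy as np

q = 2.5        # illustrative point charge
eps = 0.05     # width of the Gaussian standing in for each 1D delta

x = np.linspace(-1.0, 1.0, 101)
dx = x[1] - x[0]

def g(u, u0):
    """Normalized 1D Gaussian approximating delta(u - u0)."""
    return np.exp(-(u - u0)**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

# rho(r) = q * delta(x - x') delta(y - y') delta(z - z'), built on a 3D grid
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
rho = q * g(X, 0.1) * g(Y, -0.2) * g(Z, 0.0)

# The volume integral of the density recovers the enclosed charge
Q_tot = rho.sum() * dx**3
print(Q_tot)   # ~2.5
```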