3.2: Matrices

    It is very useful to rewrite equation (3.14) in a matrix notation. Because of the linearity of the equations of motion for harmonic motion, it will be very useful to have the tools of linear algebra at hand for our study of wave phenomena. If you haven’t studied linear algebra (or didn’t understand much of it) in math courses, DON’T PANIC. We will start from scratch by describing the properties of matrices and matrix multiplication. The important thing to keep in mind is that matrices are nothing very deep or magical. They are just bookkeeping devices designed to make your life easier when you deal with more than one equation at a time.

    A matrix is a rectangular array of numbers. An \(N \times M\) matrix has \(N\) rows and \(M\) columns. Matrices can be added and subtracted simply by adding and subtracting each of the components. The difference comes in multiplication. It is very convenient to define a multiplication law that defines the product of an \(N \times M\) matrix on the left with an \(M \times L\) matrix on the right (the order is important!) to be an \(N \times L\) matrix as follows:

    Call the \(N \times M\) matrix \(A\) and let \(A_{jk}\) be the number in the \(j\)th row and \(k\)th column for \(1 \leq j \leq N\) and \(1 \leq k \leq M\). These individual components of the matrix are called matrix elements. In terms of its matrix elements, the matrix \(A\) looks like: \[A=\left(\begin{array}{cccc}
    A_{11} & A_{12} & \cdots & A_{1 M} \\
    A_{21} & A_{22} & \cdots & A_{2 M} \\
    \vdots & \vdots & \ddots & \vdots \\
    A_{N 1} & A_{N 2} & \cdots & A_{N M}
    \end{array}\right) .\]

    Call the \(M \times L\) matrix \(B\) with matrix elements \(B_{kl}\) for \(1 \leq k \leq M\) and \(1 \leq l \leq L\): \[B=\left(\begin{array}{cccc}
    B_{11} & B_{12} & \cdots & B_{1 L} \\
    B_{21} & B_{22} & \cdots & B_{2 L} \\
    \vdots & \vdots & \ddots & \vdots \\
    B_{M 1} & B_{M 2} & \cdots & B_{M L}
    \end{array}\right) .\]

    Call the \(N \times L\) matrix \(C\) with matrix elements \(C_{jl}\) for \(1 \leq j \leq N\) and \(1 \leq l \leq L\). \[C=\left(\begin{array}{cccc}
    C_{11} & C_{12} & \cdots & C_{1 L} \\
    C_{21} & C_{22} & \cdots & C_{2 L} \\
    \vdots & \vdots & \ddots & \vdots \\
    C_{N 1} & C_{N 2} & \cdots & C_{N L}
    \end{array}\right) .\]

    Then the matrix \(C\) is defined to be the product matrix \(A B\) if \[C_{j l}=\sum_{k=1}^{M} A_{j k} \cdot B_{k l} .\]

    Equation (3.23) is the algebraic statement of the “row-column” rule. To compute the \(j \ell\) matrix element of the product matrix, \(AB\), take the \(j\)th row of the matrix \(A\) and the \(\ell\)th column of the matrix \(B\) and form their dot-product (corresponding to the sum over \(k\) in (3.23)). This rule is illustrated below: \[\left(\begin{array}{ccccc}
    A_{11} & \cdots & A_{1 k} & \cdots & A_{1 M} \\
    \vdots & \ddots & \vdots & \ddots & \vdots \\
    \hline A_{j 1} & \cdots & A_{j k} & \cdots & A_{j M} \\
    \hline \vdots & \ddots & \vdots & \ddots & \vdots \\
    A_{N 1} & \cdots & A_{N k} & \cdots & A_{N M}
    \end{array}\right)\left(\begin{array}{cc|c|cc}
    B_{11} & \cdots & B_{1 \ell} & \cdots & B_{1 L} \\
    \vdots & \ddots & \vdots & \ddots & \vdots \\
    B_{k 1} & \cdots & B_{k \ell} & \cdots & B_{k L} \\
    \vdots & \ddots & \vdots & \ddots & \vdots \\
    B_{M 1} & \cdots & B_{M \ell} & \cdots & B_{M L}
    \end{array}\right)\]

    \[=\left(\begin{array}{ccccc}
    C_{11} & \cdots & C_{1 \ell} & \cdots & C_{1 L} \\
    \vdots & \ddots & \vdots & \ddots & \vdots \\
    C_{j 1} & \cdots & C_{j \ell} & \cdots & C_{j L} \\
    \vdots & \ddots & \vdots & \ddots & \vdots \\
    C_{N 1} & \cdots & C_{N \ell} & \cdots & C_{N L}
    \end{array}\right) .\]

    For example, \[\left(\begin{array}{cc}
    2 & 3 \\
    0 & 1 \\
    2 & -1
    \end{array}\right) \cdot\left(\begin{array}{lll}
    1 & 0 & 2 \\
    0 & 1 & 3
    \end{array}\right)=\left(\begin{array}{ccc}
    2 & 3 & 13 \\
    0 & 1 & 3 \\
    2 & -1 & 1
    \end{array}\right) .\]
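
    The row-column rule is easy to turn into a short program. The following is a minimal sketch (not part of the text; the function name `matmul` is just an illustrative choice) that implements (3.23) directly and reproduces the product above:

```python
# Minimal sketch of the row-column rule (3.23): C_jl = sum_k A_jk * B_kl.
def matmul(A, B):
    """Multiply an N x M matrix A by an M x L matrix B (both lists of rows)."""
    N, M, L = len(A), len(B), len(B[0])
    assert all(len(row) == M for row in A), "columns of A must equal rows of B"
    return [[sum(A[j][k] * B[k][l] for k in range(M)) for l in range(L)]
            for j in range(N)]

A = [[2, 3], [0, 1], [2, -1]]          # the 3 x 2 matrix from the example
B = [[1, 0, 2], [0, 1, 3]]             # the 2 x 3 matrix from the example
print(matmul(A, B))                    # [[2, 3, 13], [0, 1, 3], [2, -1, 1]]
```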

    It is easy to check that the matrix product defined in this way is associative, \((AB)C = A(BC)\). However, in general, it is not commutative, \(A B \neq B A\). In fact, if the matrices are not square, the product in the opposite order may not even make any sense! The matrix product \(AB\) only makes sense if the number of columns of \(A\) is the same as the number of rows of \(B\). Beware!

    Except for the fact that it is not commutative, matrix multiplication behaves very much like ordinary multiplication. For example, there are “identity” matrices. The \(N \times N\) identity matrix, called \(I\), has zeros everywhere except for 1’s down the diagonal. For example, the \(3 \times 3\) identity matrix is \[I=\left(\begin{array}{lll}
    1 & 0 & 0 \\
    0 & 1 & 0 \\
    0 & 0 & 1
    \end{array}\right) .\]

    The \(N \times N\) identity matrix satisfies \[\begin{array}{l}
    I A=A I=A \text { for any } N \times N \text { matrix } A ; \\
    I B=B \text { for any } N \times M \text { matrix } B ; \\
    C I=C \text { for any } M \times N \text { matrix } C .
    \end{array}\]
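
    These properties are easy to check numerically. Here is a minimal sketch (not from the text) using numpy, whose `np.eye(N)` builds the \(N \times N\) identity matrix:

```python
# Minimal sketch: the identity matrix leaves compatible matrices unchanged.
import numpy as np

I3 = np.eye(3)                          # the 3 x 3 identity matrix
B = np.arange(6.0).reshape(3, 2)        # an arbitrary 3 x 2 matrix
print(np.allclose(I3 @ B, B))           # True:  I B = B
print(np.allclose(B @ np.eye(2), B))    # True:  B I = B (with the 2 x 2 identity)
```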

    We will be primarily concerned with “square” (that is \(N \times N\)) matrices.

    Matrices allow us to deal with many linear equations at the same time.

    An \(N\) dimensional column vector can be regarded as an \(N \times 1\) matrix. We will call this object an “\(N\)-vector.” It should not be confused with a coordinate vector in three-dimensional space. Likewise, we can think of an \(N\) dimensional row vector as a \(1 \times N\) matrix. Matrix multiplication can also describe the product of a matrix with a vector to give a vector. The particularly important case that we will need in order to analyze wave phenomena involves square matrices. Consider an \(N \times N\) matrix \(A\) multiplying an \(N\)-vector, \(X\), to give another \(N\)-vector, \(F\). The square matrix \(A\) has \(N^{2}\) matrix elements, \(A_{jk}\) for \(j\) and \(k = 1\) to \(N\). The vectors \(X\) and \(F\) each have \(N\) matrix elements, just their components \(X_{j}\) and \(F_{j}\) for \(j = 1\) to \(N\). Then the matrix equation: \[A X=F\]

    actually stands for \(N\) equations: \[\sum_{k=1}^{N} A_{j k} \cdot X_{k}=F_{j}\]

    for \(j=1\) to \(N\). In other words, these are \(N\) simultaneous linear equations for the \(N\) \(X_{j}\)’s. You all know, from your studies of algebra, how to solve for the \(X_{j}\)’s in terms of the \(F_{j}\)’s and the \(A_{jk}\)’s, but it is very useful to do it in matrix notation. Sometimes, we can find the “inverse” of the matrix \(A\), \(A^{-1}\), which has the property \[A A^{-1}=A^{-1} A=I ,\]

    where \(I\) is the identity matrix discussed in (3.26) and (3.27). If we can find such a matrix, then the \(N\) simultaneous linear equations, (3.29), have a unique solution that we can write in a very compact form. Multiply both sides of (3.29) by \(A^{-1}\). On the left-hand side, we can use (3.30) and (3.27) to get rid of the \(A^{-1} A\) and write the solution as follows: \[X=A^{-1} F .\]
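
    In numerical work the compact solution \(X=A^{-1} F\) is easy to check. Here is a minimal sketch (with made-up numbers, not from the text) using numpy:

```python
# Minimal sketch of solving A X = F, with illustrative numbers.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])              # an invertible 2 x 2 matrix (made up)
F = np.array([5.0, 10.0])

X = np.linalg.inv(A) @ F                # X = A^{-1} F
print(X)                                # [1. 3.]
print(np.linalg.solve(A, F))            # same result without forming A^{-1}
```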

    Inverse and Determinant

    We can compute \(A^{-1}\) in terms of the “determinant” of \(A\). The determinant of the matrix \(A\) is a sum of products of the matrix elements of \(A\) with the following properties:

    • There are \(N!\) terms in the sum;
    • Each term in the sum is a product of \(N\) different matrix elements;
    • In each product, every row number and every column number appears exactly once;
    • Every such product can be obtained from the product of the diagonal elements, \(A_{11} A_{22} \cdots A_{N N}\), by a sequence of interchanges of the column labels. For example, \(A_{12} A_{21} A_{33} \cdots A_{N N}\) involves one interchange while \(A_{12} A_{23} A_{31} A_{44} \cdots A_{N N}\) requires two.
    • The coefficient of a product in the determinant is +1 if it involves an even number of interchanges and −1 if it involves an odd number of interchanges.

    Thus the determinant of a \(2 \times 2\) matrix, \(A\) is \[\operatorname{det} A=A_{11} A_{22}-A_{12} A_{21} .\]

    The determinant of a \(3 \times 3\) matrix, \(A\) is \[\begin{gathered}
    \operatorname{det} A=A_{11} A_{22} A_{33}+A_{12} A_{23} A_{31}+A_{13} A_{21} A_{32} \\
    -A_{11} A_{23} A_{32}-A_{13} A_{22} A_{31}-A_{12} A_{21} A_{33} .
    \end{gathered}\]
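
    For instance (an illustrative example, not from the text), \[\operatorname{det}\left(\begin{array}{ccc}
    1 & 2 & 0 \\
    0 & 1 & 3 \\
    2 & 0 & 1
    \end{array}\right)=1 \cdot 1 \cdot 1+2 \cdot 3 \cdot 2+0 \cdot 0 \cdot 0-1 \cdot 3 \cdot 0-0 \cdot 1 \cdot 2-2 \cdot 0 \cdot 1=13 .\]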

    Unless you are very unlucky, you will never have to compute the determinant of a matrix larger than \(3 \times 3\) by hand. If you are so unlucky, it is best to use an inductive procedure that builds it up from the determinants of smaller submatrices. We will discuss this procedure below.

    If \(\operatorname{det} A=0\), the matrix has no inverse. It is not “invertible.” In this case, the simultaneous linear equations have either no solution at all, or an infinite number of solutions. If \(\operatorname{det} A \neq 0\), the inverse matrix exists and is uniquely given by \[A^{-1}=\frac{\tilde{A}}{\operatorname{det} A}\]

    where \(\tilde{A}\) is the cofactor matrix defined by its matrix elements as follows: \[(\tilde{A})_{j k}=\operatorname{det} A(j k)\]

    with \[\begin{aligned}
    &A(j k)_{l m}=1 \text { if } m=j \text { and } l=k \\
    &A(j k)_{l m}=0 \text { if } m=j \text { and } l \neq k \\
    &A(j k)_{l m}=0 \text { if } m \neq j \text { and } l=k \\
    &A(j k)_{l m}=A_{l m} \text { if } m \neq j \text { and } l \neq k
    \end{aligned}\]

    In other words, \(A(jk)\) is obtained from the matrix \(A\) by replacing the \(kj\) matrix element by 1 and all other matrix elements in row \(k\) or column \(j\) by 0. Thus if \[A=\left(\begin{array}{cc|c|cc}
    A_{11} & \cdots & A_{1 j} & \cdots & A_{1 N} \\
    \vdots & \ddots & \vdots & \ddots & \vdots \\
    \hline A_{k 1} & \cdots & A_{k j} & \cdots & A_{k N} \\
    \hline \vdots & \ddots & \vdots & \ddots & \vdots \\
    A_{N 1} & \cdots & A_{N j} & \cdots & A_{N N}
    \end{array}\right) ,\]

    \[A(j k)=\left(\begin{array}{cc|c|cc}
    A_{11} & \cdots & 0 & \cdots & A_{1 N} \\
    \vdots & \ddots & \vdots & \ddots & \vdots \\
    \hline 0 & \cdots & 1 & \cdots & 0 \\
    \hline \vdots & \ddots & \vdots & \ddots & \vdots \\
    A_{N 1} & \cdots & 0 & \cdots & A_{N N}
    \end{array}\right) .\]

    Note the sneaky interchange of \(j \leftrightarrow k\) in this definition, compared to (3.23).

    For example if \[A=\left(\begin{array}{ll}
    4 & 3 \\
    5 & 2
    \end{array}\right)\]

    then \[\begin{aligned}
    &A(11)=\left(\begin{array}{ll}
    1 & 0 \\
    0 & 2
    \end{array}\right) & A(12)=\left(\begin{array}{ll}
    0 & 3 \\
    1 & 0
    \end{array}\right) \\
    &A(21)=\left(\begin{array}{ll}
    0 & 1 \\
    5 & 0
    \end{array}\right) & A(22)=\left(\begin{array}{ll}
    4 & 0 \\
    0 & 1
    \end{array}\right) .
    \end{aligned}\]

    Thus, \[\tilde{A}=\left(\begin{array}{cc}
    2 & -3 \\
    -5 & 4
    \end{array}\right)\]

    and since \(\operatorname{det} A=4 \cdot 2-5 \cdot 3=-7\), \[A^{-1}=\left(\begin{array}{cc}
    -2 / 7 & 3 / 7 \\
    5 / 7 & -4 / 7
    \end{array}\right) .\]

    \(A^{-1}\) satisfies \(A A^{-1}=A^{-1} A=I\) where \(I\) is the identity matrix: \[I=\left(\begin{array}{ll}
    1 & 0 \\
    0 & 1
    \end{array}\right) .\]
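
    The cofactor construction is mechanical enough to automate. The sketch below (not from the text; the function names are illustrative) builds \(A(jk)\) exactly as defined above, forms \(\tilde{A}\), and reproduces the inverse just found:

```python
# Minimal sketch of the cofactor construction of the inverse.
import numpy as np

def A_jk(A, j, k):
    """A(jk): put 1 in the kj position, zero the rest of row k and column j.
    (j and k are 0-based here.)"""
    B = A.astype(float)      # work on a copy
    B[k, :] = 0.0
    B[:, j] = 0.0
    B[k, j] = 1.0
    return B

def inverse(A):
    N = A.shape[0]
    Atilde = np.array([[np.linalg.det(A_jk(A, j, k)) for k in range(N)]
                       for j in range(N)])          # (Atilde)_jk = det A(jk)
    return Atilde / np.linalg.det(A)

A = np.array([[4.0, 3.0], [5.0, 2.0]])
print(inverse(A))            # [[-2/7, 3/7], [5/7, -4/7]], as computed above
```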

    In terms of the submatrices, \(A(jk)\), we can define the determinant inductively, as promised above. In fact, the reason that (3.30) works is that the determinant can be written as \[\operatorname{det} A=\sum_{k=1}^{N} A_{1 k} \operatorname{det} A(k 1) .\]

    Actually this is true for any row, not just \(j = 1\). The relation, (3.30), can be rewritten as \[\sum_{k=1}^{N} A_{j k} \operatorname{det} A\left(k j^{\prime}\right)=\left\{\begin{array}{c}
    \operatorname{det} A \text { for } j=j^{\prime} \\
    0 \text { for } j \neq j^{\prime}
    \end{array}\right.\]

    The determinants of the submatrices, \(\operatorname{det} A(k 1)\), in (3.43) can, in turn, be computed by the same procedure. The result is a definition of the determinant that refers to itself. However, eventually, the process terminates because the matrices keep getting smaller, and the determinant can always be computed in this way. The only problem with this procedure is that it is very tedious for a large matrix. For an \(n \times n\) matrix, you end up computing \(n!\) terms and adding them up. For large \(n\), this is impractical. One of the nice features of the techniques that we will discuss in the coming chapters is that we will be able to avoid such calculations.
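
    For completeness, here is a minimal sketch (not from the text) of the inductive procedure, expanding along the first row as in (3.43). One can check from the definition of \(A(k1)\) that its determinant is just the minor obtained by deleting row 1 and column \(k\), with a sign \((-1)^{k+1}\), which is what the code uses:

```python
# Minimal sketch of the inductive determinant: det A = sum_k A_1k det A(k1).
def det(A):
    """Determinant of a square matrix given as a list of rows."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for k in range(n):                                      # k is 0-based here
        minor = [row[:k] + row[k + 1:] for row in A[1:]]    # drop row 1, column k+1
        total += (-1) ** k * A[0][k] * det(minor)           # A_1k * det A(k1)
    return total

print(det([[4, 3], [5, 2]]))        # -7, matching the example above
print(det([[2, 1], [1, 2]]))        # 3
```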

    More Useful Facts about Matrices

    Suppose that \(A\) and \(B\) are \(N \times N\) matrices and \(v\) is an \(N\)-vector.

    1. If you know the inverses of \(A\) and \(B\), you can find the inverse of the product, \(AB\), by multiplying the inverses in the reverse order: \[(A B)^{-1}=B^{-1} A^{-1} .\]
    2. The determinant of the product, \(AB\), is the product of the determinants: \[\operatorname{det}(A B)=\operatorname{det} A \operatorname{det} B ,\]
      thus if \(\operatorname{det}(A B)=0\), then either \(A\) or \(B\) has vanishing determinant.
    3. A matrix multiplying a nonzero vector can give zero only if the determinant of the matrix vanishes: \[A v=0 \Rightarrow \operatorname{det} A=0 \text { or } v=0\]
      This is the statement, in matrix language, that \(N\) homogeneous linear equations in \(N\) unknowns can have a nontrivial solution, \(v \neq 0\), only if the determinant of the coefficients vanishes.
    4. Similarly, if \(\operatorname{det} A=0\), there exists a nonzero vector, \(v\), that is annihilated by \(A\): \[\operatorname{det} A=0 \Rightarrow \exists v \neq 0 \text { such that } A v=0 .\]
      This is the statement, in matrix language, that \(N\) homogeneous linear equations in \(N\) unknowns actually do have a nontrivial solution, \(v \neq 0\), if the determinant of the coefficients vanishes.
    5. The transpose of an \(N \times M\) matrix \(A\), denoted by \(A^{T}\), is the \(M \times N\) matrix obtained by reflecting the matrix about a diagonal line through the upper left-hand corner. Thus if \[A=\left(\begin{array}{cccc}
      A_{11} & A_{12} & \cdots & A_{1 M} \\
      A_{21} & A_{22} & \cdots & A_{2 M} \\
      \vdots & \vdots & \ddots & \vdots \\
      \vdots & \vdots & \ddots & \vdots \\
      A_{N 1} & A_{N 2} & \cdots & A_{N M}
      \end{array}\right)\]
      then \[A^{T}=\left(\begin{array}{ccccc}
      A_{11} & A_{21} & \cdots & \cdots & A_{N 1} \\
      A_{12} & A_{22} & \cdots & \cdots & A_{N 2} \\
      \vdots & \vdots & \ddots & \ddots & \vdots \\
      A_{1 M} & A_{2 M} & \cdots & \cdots & A_{N M}
      \end{array}\right) .\]
      Note that if \(N \neq M\), the shape of the matrix is changed by transposition. Only for square matrices does the transpose give you back a matrix of the same kind. A square matrix that is equal to its transpose is called a “symmetric” matrix.
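
    These facts are easy to spot-check numerically. The sketch below (not from the text; the random matrices are just illustrative) verifies facts 1, 2, and 5 for a pair of \(3 \times 3\) matrices:

```python
# Minimal sketch checking facts 1, 2, and 5 on random 3 x 3 matrices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# Fact 1: (AB)^{-1} = B^{-1} A^{-1}
print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))

# Fact 2: det(AB) = det A det B
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))

# Fact 5: a square matrix equal to its transpose is symmetric; A + A^T is one
S = A + A.T
print(np.allclose(S, S.T))
```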

    Eigenvalue Equations

    We will make extensive use of the concept of an “eigenvalue equation.” For an \(N \times N\) matrix, \(R\), the eigenvalue equation has the form: \[R c=h c,\]

    where \(c\) is a nonzero \(N\)-vector,1 and \(h\) is a number. The idea is to find both the number, \(h\), which is called the eigenvalue, and the vector, \(c\), which is called the eigenvector. This is the problem we discussed in chapter 1 in (1.78) in connection with time translation invariance, but now written in matrix form.

    A couple of examples may be in order. Suppose that \(R\) is a diagonal matrix, like \[R=\left(\begin{array}{ll}
    2 & 0 \\
    0 & 1
    \end{array}\right) .\]

    Then the eigenvalues are just the diagonal elements, 2 and 1, and the eigenvectors are vectors in the coordinate directions, \[R\left(\begin{array}{l}
    1 \\
    0
    \end{array}\right)=2\left(\begin{array}{l}
    1 \\
    0
    \end{array}\right), \quad R\left(\begin{array}{l}
    0 \\
    1
    \end{array}\right)=1\left(\begin{array}{l}
    0 \\
    1
    \end{array}\right) .\]

    A less obvious example is \[R=\left(\begin{array}{ll}
    2 & 1 \\
    1 & 2
    \end{array}\right) .\]

    This time the eigenvalues are 3 and 1, and the eigenvectors are as shown below: \[R\left(\begin{array}{l}
    1 \\
    1
    \end{array}\right)=3\left(\begin{array}{l}
    1 \\
    1
    \end{array}\right), \quad R\left(\begin{array}{c}
    1 \\
    -1
    \end{array}\right)=1\left(\begin{array}{c}
    1 \\
    -1
    \end{array}\right) .\]
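
    A quick numerical check of this example (a sketch, not from the text):

```python
# Minimal sketch checking the eigenvalues and eigenvectors of R = [[2,1],[1,2]].
import numpy as np

R = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, vecs = np.linalg.eig(R)
print(vals)                         # 3 and 1 (numpy may order them differently)
print(R @ np.array([1.0, 1.0]))     # [3. 3.]  =  3 * (1, 1)
print(R @ np.array([1.0, -1.0]))    # [1. -1.] =  1 * (1, -1)
```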

    It may seem odd that in the eigenvalue equation, both the eigenvalue and the eigenvector are unknowns. The reason that it works is that for most values of \(h\), the equation, (3.51), has no solution. To see this, we write (3.51) as a set of homogeneous linear equations for the components of the eigenvector, \(c\), \[(R-h I) c=0 .\]

    The set of equations, (3.56), has nonzero solutions for \(c\) only if the determinant of the coefficient matrix, \(R-h I\), vanishes. But this will happen only for \(N\) values of \(h\), because the condition \[\operatorname{det}(R-h I)=0\]

    is an \(N\)th order equation for \(h\). For each \(h\) that solves (3.57), we can find a solution for \(c\).2 We will give some examples of this procedure below.
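
    For the second example above, this works out explicitly as \[\operatorname{det}(R-h I)=\operatorname{det}\left(\begin{array}{cc}
    2-h & 1 \\
    1 & 2-h
    \end{array}\right)=(2-h)^{2}-1=(h-1)(h-3) ,\] which vanishes for \(h=1\) and \(h=3\), the two eigenvalues quoted above. Substituting each value of \(h\) back into (3.56) then gives the corresponding eigenvector.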

    Matrix Equation of Motion

    It is very useful to rewrite the equation of motion, (3.14), in a matrix notation. Define a column vector, \(X\), whose \(j\)th row (from the top) is the coordinate \(x_{j}\): \[X=\left(\begin{array}{c}
    x_{1} \\
    x_{2} \\
    \vdots \\
    x_{n}
    \end{array}\right) .\]

    Define the “\(K\) matrix”, an \(n \times n\) matrix that has the coefficient \(K_{jk}\) in its \(j\)th row and \(k\)th column: \[K=\left(\begin{array}{cccc}
    K_{11} & K_{12} & \cdots & K_{1 n} \\
    K_{21} & K_{22} & \cdots & K_{2 n} \\
    \vdots & \vdots & \ddots & \vdots \\
    K_{n 1} & K_{n 2} & \cdots & K_{n n}
    \end{array}\right) .\]

    \(K_{jk}\) is said to be the “\(jk\) matrix element” of the \(K\) matrix. Because of equation (3.19), the matrix \(K\) is symmetric, \(K = K^{T}\).

    Define the diagonal matrix \(M\) with \(m_{j}\) in the \(j\)th row and \(j\)th column and zeroes elsewhere \[M=\left(\begin{array}{cccc}
    m_{1} & 0 & \cdots & 0 \\
    0 & m_{2} & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & m_{n}
    \end{array}\right) .\]

    \(M\) is called the “mass matrix.”

    Using these definitions, we can rewrite (3.14) in matrix notation as follows: \[M \frac{d^{2} X}{d t^{2}}=-K X .\]

    There is nothing very fancy going on here. We have just used the matrix notation to get rid of the summation sign in (3.14). The sum is now implicit in the matrix multiplication in (3.61). This is useful because we can now use the properties of matrices and matrix multiplication discussed above to manipulate (3.61). For example, we can simplify (3.61) a bit by multiplying on the left by \(M^{-1}\) to get \[\frac{d^{2} X}{d t^{2}}=-M^{-1} K X .\]
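
    As a purely numerical illustration (the masses and the \(K\) matrix below are made up, not the system of (3.14)), one can build \(M\) and \(K\), form \(M^{-1} K\), and then apply the eigenvalue machinery described above to it:

```python
# Minimal sketch: form M^{-1} K for made-up values and find its eigenvalues.
import numpy as np

M = np.diag([1.0, 2.0])                 # illustrative diagonal mass matrix
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])             # illustrative symmetric K matrix

MinvK = np.linalg.inv(M) @ K
vals, vecs = np.linalg.eig(MinvK)       # eigenvalues and eigenvectors of M^{-1} K
print(vals)
print(vecs)                             # columns are the eigenvectors
```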

    _____________________
    1\(c = 0\) doesn’t count, because the equation is satisfied trivially for any \(h\). We are interested only in nontrivial solutions.
    2The situation is slightly more complicated when the solutions for \(h\) are degenerate. We discuss this in (3.117) below.


    This page titled 3.2: Matrices is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Howard Georgi via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.