19.2: Appendix - Matrix Algebra

    Matrices

    Matrix algebra provides an elegant and powerful representation of the multivariate operators and coordinate transformations that feature prominently in classical mechanics. For example, matrices play a pivotal role in finding the eigenvalues and eigenfunctions of the coupled equations that occur in rigid-body rotation and coupled-oscillator systems. An understanding of the role of matrix mechanics in classical mechanics facilitates understanding of the equally important role played by matrix mechanics in quantal physics.

    It is interesting that, although determinants were used by physicists in the late \(19^{th}\) century and the concept of matrix algebra was developed by Arthur Cayley in England in 1855, many of these ideas were the work of Hamilton, and the discussion of matrix algebra was buried in a more general discussion of determinants. Matrix algebra remained an esoteric branch of mathematics, little known by the physics community, until 1925 when Heisenberg proposed his innovative new quantum theory. The striking feature of this new theory was its representation of physical quantities by sets of time-dependent complex numbers and a peculiar multiplication rule. Max Born recognized that Heisenberg’s multiplication rule is just the standard “row times column” multiplication rule of matrix algebra, a topic that he had encountered as a young student in a mathematics course. In 1924 Richard Courant had just completed the first volume of the new text Methods of Mathematical Physics, for which Pascual Jordan had served as his young assistant, working on matrix manipulation. Fortuitously, Jordan and Born happened to share a carriage on a train to Hanover, during which Jordan overheard Born talk about his problems trying to work with matrices. Jordan introduced himself to Born and offered to help. This led to publication, in September 1925, of the famous Born-Jordan paper [Bor25a] that gave the first rigorous formulation of matrix mechanics in physics. This was followed in November by the Born-Heisenberg-Jordan sequel [Bor25b] that established a logically consistent general method for solving matrix mechanics problems, plus a connection between the mathematics of matrix mechanics and linear algebra. Matrix algebra developed into an important tool in mathematics and physics during World War II, and now it is an integral part of undergraduate linear algebra courses.

    Most applications of matrix algebra in this book are restricted to real, symmetric, square matrices. The size of such a matrix is specified by its rank, which equals both the row rank and the column rank, i.e. the number of linearly independent row or column vectors in the square matrix. It is presumed that you have studied matrices in a linear algebra course. Thus the goal of this review is to summarize the simple manipulations of symmetric matrices, and the matrix diagonalization, that will be used in this course. You are referred to a linear algebra textbook if you need further details.

    Matrix definition

    A matrix is a rectangular array of numbers with \(M\) rows and \(N\) columns. The notation used for an element of a matrix is \(A_{ij}\) where \(i\) designates the row and \(j\) designates the column of this matrix element in the matrix \(\mathbf{A}\). Convention denotes a matrix \(\mathbf{A}\) as

    \[\mathbf{A} \equiv \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1(N−1)} & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2(N−1)} & A_{2N} \\ \vdots & \vdots & A_{ij} & \vdots & \vdots \\ A_{(M−1)1} & A_{(M−1)2} & \cdots & A_{(M−1)(N−1)} & A_{(M−1)N} \\ A_{M1} & A_{M2} & \cdots & A_{M(N−1)} & A_{MN} \end{pmatrix} \label{A.1}\]

    Matrices can be square, \(M = N\), or rectangular \(M \neq N\). Matrices having only one row or column are called row or column vectors respectively, and need only a single subscript label. For example,

    \[\mathbf{A} = \begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_{M−1} \\ A_M \end{pmatrix} \label{A.2}\]

    Matrix manipulation

    Matrices are defined to obey certain rules for matrix manipulation as given below.

    1) Multiplication of a matrix by a scalar \(\lambda\) simply multiplies each matrix element by \(\lambda \).

    \[C_{ij} = \lambda A_{ij} \label{A.3}\]

    2) Addition of two matrices \(\mathbf{A}\) and \(\mathbf{B}\), which must have the same numbers of rows and columns, is given by the element-by-element sum

    \[C_{ij} = A_{ij} + B_{ij} \label{A.4}\]

    3) Multiplication of a matrix \(\mathbf{A}\) by a matrix \(\mathbf{B}\) is defined only if the number of columns in \(\mathbf{A}\) equals the number of rows in \(\mathbf{B}\). The product matrix \(\mathbf{C}\) is given by the matrix product

    \[\mathbf{C} = \mathbf{A} \cdot \mathbf{B} \label{A.5}\]

    \[C_{ij} = [AB]_{ij} = \sum_k A_{ik}B_{kj} \label{A.6}\]

    For example, if both \(\mathbf{A}\) and \(\mathbf{B}\) are rank three (\(3 \times 3\)) matrices, then

    \[\begin{align*} \mathbf{C} &= \mathbf{A} \cdot \mathbf{B} \\[4pt] &= \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{pmatrix} \\[4pt] &= \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31} & A_{11}B_{12} + A_{12}B_{22} + A_{13}B_{32} & A_{11}B_{13} + A_{12}B_{23} + A_{13}B_{33} \\ A_{21}B_{11} + A_{22}B_{21} + A_{23}B_{31} & A_{21}B_{12} + A_{22}B_{22} + A_{23}B_{32} & A_{21}B_{13} + A_{22}B_{23} + A_{23}B_{33} \\ A_{31}B_{11} + A_{32}B_{21} + A_{33}B_{31} & A_{31}B_{12} + A_{32}B_{22} + A_{33}B_{32} & A_{31}B_{13} + A_{32}B_{23} + A_{33}B_{33} \end{pmatrix} \end{align*}\]

    In general, multiplication of matrices \(\mathbf{A}\) and \(\mathbf{B}\) is noncommutative, i.e.

    \[\mathbf{A} \cdot \mathbf{B} \neq \mathbf{B} \cdot \mathbf{A} \label{A.7}\]

    In the special case when \(\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}\) then the matrices are said to commute.
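    These manipulation rules map directly onto any numerical linear-algebra package. The following minimal sketch, written in Python with NumPy and using arbitrarily chosen illustrative matrices (not examples from the text), checks the scalar-multiplication, addition, and row-times-column product rules, and demonstrates that the matrix product is noncommutative in general.

    ```python
    import numpy as np

    # Arbitrary illustrative 3 x 3 matrices
    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 10.0]])
    B = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 2.0],
                  [0.0, 2.0, 1.0]])

    C_scalar = 3.0 * A      # rule 1: each element multiplied by the scalar, Equation (A.3)
    C_sum    = A + B        # rule 2: element-by-element addition, Equation (A.4)
    C_prod   = A @ B        # rule 3: row-times-column product, Equation (A.6)

    # Matrix multiplication is noncommutative in general, Equation (A.7)
    print(np.allclose(A @ B, B @ A))    # False for this choice of A and B
    ```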

    Transposed matrix \(\mathbf{A}^T\)

    The transpose of a matrix \(\mathbf{A}\) will be denoted by \(\mathbf{A}^T\) and is given by interchanging rows and columns, that is

    \[\left( A^T \right)_{ij} = A_{ji} \label{A.8}\]

    The transpose of a column vector is a row vector. Note that older texts use the symbol \(\mathbf{\tilde{A}}\) for the transpose.

    Identity (unity) matrix \(\mathbb{I}\)

    The identity (unity) matrix \(\mathbb{I}\) is diagonal with diagonal elements equal to 1, that is

    \[\mathbb{I}_{ij} = \delta_{ij} \label{A.9}\]

    where the Kronecker delta symbol is defined by

    \[\begin{align} \delta_{ik} & = 0 && \text{ if } i \neq k \label{A.10} \\ & = 1 && \text{ if } i = k \nonumber\end{align}\]

    Inverse matrix \(\mathbf{A}^{−1}\)

    If a matrix is non-singular, that is, its determinant is non-zero, then it is possible to define an inverse matrix \(\mathbf{A}^{−1}\). A non-singular square matrix \(\mathbf{A}\) has an inverse matrix \(\mathbf{A}^{−1}\) for which the product

    \[\mathbf{A} \cdot \mathbf{A}^{−1} = \mathbb{I} \label{A.11}\]

    Orthogonal matrix

    A matrix with real elements is orthogonal if

    \[\mathbf{A}^T = \mathbf{A}^{−1} \label{A.12}\]

    That is

    \[\sum_k \left( A^T \right)_{ik} A_{kj} = \sum_k A_{ki}A_{kj} = \delta_{ij} \label{A.13}\]
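    As a concrete numerical check of Equations \ref{A.12} and \ref{A.13}, the short NumPy sketch below (an illustration; the rotation angle is an arbitrary choice) builds a two-dimensional rotation matrix, which is orthogonal, and verifies that its transpose equals its inverse.

    ```python
    import numpy as np

    theta = 0.3                        # arbitrary rotation angle (illustrative choice)
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # a real orthogonal (rotation) matrix

    # A^T A = I, Equation (A.13), so the transpose equals the inverse, Equation (A.12)
    print(np.allclose(A.T @ A, np.eye(2)))       # True
    print(np.allclose(A.T, np.linalg.inv(A)))    # True
    ```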

    Adjoint matrix \(A^{\dagger}\)

    For a matrix with complex elements, the adjoint matrix, denoted by \(\mathbf{A}^{\dagger}\), is defined as the transpose of the complex conjugate, that is

    \[\left( A^{\dagger}\right)_{ij} = A^{*}_{ji} \label{A.14}\]

    Hermitian matrix

    The Hermitian conjugate of a complex matrix \(\mathbf{H}\) is denoted as \(\mathbf{H}^{\dagger}\) and is defined as

    \[\mathbf{H}^{\dagger} = \left( \mathbf{H}^T \right)^{*} = (\mathbf{H}^{*})^T \label{A.15}\]

    Therefore

    \[H^{\dagger}_{ij} = H^{*}_{ji} \label{A.16}\]

    A matrix is Hermitian if it is equal to its adjoint

    \[\mathbf{H}^{\dagger} = \mathbf{H} \label{A.17}\]

    that is

    \[H^{\dagger}_{ij} = H^{*}_{ji} = H_{ij} \label{A.18}\]

    A matrix that is both Hermitian and has real elements is a symmetric matrix since complex conjugation has no effect.

    Unitary matrix

    A matrix with complex elements is unitary if its inverse is equal to the adjoint matrix

    \[\mathbf{U}^{\dagger} = \mathbf{U}^{−1} \label{A.19}\]

    which is equivalent to

    \[\mathbf{U}^{\dagger}\mathbf{U} = \mathbb{I} \label{A.20}\]

    A unitary matrix with real elements is an orthogonal matrix as given in Equation \ref{A.12}.
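    The adjoint, Hermitian, and unitary definitions can be checked numerically in the same spirit. In the NumPy sketch below (an illustration; the matrix H is an arbitrary choice), .conj().T forms the adjoint of Equation \ref{A.14}, the first test implements the Hermitian condition of Equation \ref{A.17}, and the second implements the unitary condition of Equation \ref{A.20}.

    ```python
    import numpy as np

    # An arbitrary Hermitian matrix: it equals its own adjoint, Equation (A.17)
    H = np.array([[2.0, 1.0 - 1.0j],
                  [1.0 + 1.0j, 3.0]])
    print(np.allclose(H.conj().T, H))               # True: H is Hermitian

    # The matrix whose columns are the orthonormal eigenvectors of a Hermitian
    # matrix is unitary, so it satisfies U^dagger U = I, Equation (A.20).
    _, U = np.linalg.eigh(H)
    print(np.allclose(U.conj().T @ U, np.eye(2)))   # True: U is unitary
    ```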

    Trace of a square matrix \(Tr \mathbf{A}\)

    The trace of a square matrix, denoted by \(Tr\mathbf{A}\), is defined as the sum of the diagonal matrix elements.

    \[Tr\mathbf{A} = \sum^N_{i=1} A_{ii} \label{A.21}\]

    Inner product of column vectors

    Real vectors

    The generalization of the scalar (dot) product in Euclidean space is called the inner product. Exploiting the rules of matrix multiplication requires taking the transpose of the first column vector to form a row vector which then is multiplied by the second column vector using the conventional rules for matrix multiplication. That is, for rank \(N\) vectors

    \[[\mathbf{X}] \cdot [\mathbf{Y}] = \begin{pmatrix} X_1 \\ X_2 \\ : \\ X_N \end{pmatrix} \cdot \begin{pmatrix} Y_1 \\ Y_2 \\ : \\ Y_N \end{pmatrix} = [\mathbf{X}]^T [\mathbf{Y}] = \begin{pmatrix} X_1 & X_2 & .. & X_N \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ : \\ Y_N \end{pmatrix} = \sum^N_{i=1} X_iY_i \label{A.22}\]

    For rank \(N = 3\) this inner product agrees with the conventional definition of the scalar product and gives a result that is a scalar. For the special case when \([\mathbf{X}] \cdot [\mathbf{Y}]=0\) the two vectors are said to be orthogonal. The magnitude squared of a column vector is given by the inner product

    \[[\mathbf{X}] \cdot [\mathbf{X}] = \sum^N_{i=1} (X_i)^2 \geq 0 \label{A.23}\]

    Note that this quantity is never negative.

    Complex vectors

    For vectors having complex matrix elements the inner product is generalized to a form that is consistent with Equation \ref{A.22} when the column vector matrix elements are real.

    \[[\mathbf{X}]^{*} \cdot [\mathbf{Y}]=[\mathbf{X}]^{\dagger} [\mathbf{Y}] = \begin{pmatrix} X^{*}_1 & X^{*}_2 & \cdots & X^{*}_{N−1} & X^{*}_N \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{N−1} \\ Y_N \end{pmatrix} = \sum^N_{i=1} X^{*}_i Y_i \label{A.24}\]

    For the special case

    \[[\mathbf{X}]^{*} \cdot [\mathbf{X}]=[\mathbf{X}]^{\dagger} [\mathbf{X}] = \sum^N_{i=1} X^{*}_i X_i \geq 0 \label{A.25}\]
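    A minimal NumPy sketch of Equations \ref{A.22} through \ref{A.25}, using arbitrary illustrative vectors: for real column vectors the inner product is the familiar dot product, while for complex vectors the elements of the first vector are conjugated.

    ```python
    import numpy as np

    # Real column vectors, Equation (A.22)
    X = np.array([1.0, 2.0, 3.0])
    Y = np.array([4.0, -1.0, 2.0])
    print(X @ Y)                   # sum_i X_i Y_i = 8.0

    # Complex column vectors, Equation (A.24); np.vdot conjugates its first argument
    Xc = np.array([1.0 + 1.0j, 2.0j])
    Yc = np.array([3.0, 1.0 - 1.0j])
    print(np.vdot(Xc, Yc))         # sum_i X_i^* Y_i

    # Magnitude squared, Equation (A.25), is real and never negative
    print(np.vdot(Xc, Xc).real)    # 6.0
    ```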

    Determinants

    Definition

    The determinant of a square matrix with \(N\) rows is a single number derived from the matrix elements of the matrix. The determinant is denoted as \(\det \mathbf{A}\) or \(|\mathbf{A}|\) where

    \[|\mathbf{A}| = \sum_{(j_1, j_2, \dots, j_N)} \varepsilon (j_1, j_2, \dots, j_N )\, A_{1j_1}A_{2j_2} \cdots A_{Nj_N} \label{A.26}\]

    where the sum runs over all \(N!\) permutations \((j_1, j_2, \dots, j_N)\) of the normal order \((1, 2, 3, \dots, N)\), and the permutation sign \(\varepsilon (j_1, j_2, \dots, j_N)\) equals \(+1\) or \(−1\) according to whether an even or odd number of interchanges is required to go from the normal order to the sequence \((j_1, j_2, j_3, \dots, j_N)\).

    For example for \(N = 3\) the determinant is

    \[|\mathbf{A}| = A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} − A_{13}A_{22}A_{31} − A_{11}A_{23}A_{32} − A_{12}A_{21}A_{33} \label{A.27}\]
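    The permutation expansion of Equation \ref{A.26} can be checked directly against a library routine. The Python sketch below is an illustration; the matrix and the helper function det_by_permutations are hypothetical choices, not part of the text. It sums over all permutations with the appropriate sign and compares the result with numpy.linalg.det, reproducing Equation \ref{A.27} for \(N = 3\).

    ```python
    import numpy as np
    from itertools import permutations

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])     # arbitrary illustrative matrix

    def det_by_permutations(A):
        """Permutation expansion of the determinant, Equation (A.26)."""
        N = A.shape[0]
        total = 0.0
        for perm in permutations(range(N)):
            # sign of the permutation: +1 for an even, -1 for an odd number of inversions
            inversions = sum(perm[i] > perm[j] for i in range(N) for j in range(i + 1, N))
            sign = -1.0 if inversions % 2 else 1.0
            total += sign * np.prod([A[i, perm[i]] for i in range(N)])
        return total

    print(det_by_permutations(A), np.linalg.det(A))   # both give 8.0 for this matrix
    ```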

    Properties

    1. The value of a determinant \(|A| = 0\), if
      1. all elements of a row (column) are zero.
      2. all elements of a row (column) are identical with, or multiples of, the corresponding elements of another row (column).
    2. The value of a determinant is unchanged if
      1. rows and columns are interchanged.
      2. a linear combination of the other rows is added to any one row.
    3. The value of a determinant changes sign if any two rows, or any two columns, are interchanged.
    4. Transposing a square matrix does not change its determinant. \(\left|\mathbf{A}^T\right| = |\mathbf{A}|\)
    5. If any row (column) is multiplied by a constant factor then the value of the determinant is multiplied by the same factor.
    6. The determinant of a diagonal matrix equals the product of the diagonal matrix elements. That is, when \(A_{ij} = \lambda_i\delta_{ij}\) then \(|\mathbf{A}| = \lambda_1\lambda_2\lambda_3\dots \lambda_N\)
    7. The determinant of the identity (unity) matrix \(|\mathbb{I}| = 1\).
    8. The determinant of the null matrix, for which all matrix elements are zero, \(|\mathbf{0}| = 0\)
    9. A singular matrix has a determinant equal to zero.
    10. If each element of any row (column) appears as the sum (difference) of two or more quantities, then the determinant can be written as a sum (difference) of two or more determinants of the same order. For example for order \(N = 2\), \[\begin{vmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} & A_{22} \end{vmatrix} = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} \pm \begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix} \nonumber\]
    11. A determinant of a matrix product equals the product of the determinants. That is, if \(\mathbf{C} = \mathbf{AB}\) then \(|\mathbf{C}| = |\mathbf{A}| |\mathbf{B}|\)

    Cofactor of a square matrix

    For a square matrix having \(N\) rows the cofactor is obtained by removing the \(i^{th}\) row and the \(j^{th}\) column and then collapsing the remaining matrix elements into a square matrix with \(N − 1\) rows while preserving the order of the matrix elements. This is called the complementary minor which is denoted as \(A^{(ij)}\). The matrix elements of the cofactor square matrix \(\mathbf{a}\) are obtained by multiplying the determinant of the \((ij)\) complementary minor by the phase factor \((−1)^{i+j}\). That is

    \[a_{ij} = (−1)^{i+j} \left| A^{(ij)} \right| \label{A.28}\]

    The cofactor matrix has the property that

    \[\sum^N_{k=1} A_{ik}a_{jk} = \delta_{ij} |\mathbf{A}| = \sum^N_{k=1} A_{ki}a_{kj} \label{A.29}\]

    Cofactors are used to expand the determinant of a square matrix in order to evaluate the determinant.

    Inverse of a non-singular matrix

    The \((i, j)\) matrix elements of the inverse matrix \(\mathbf{A}^{−1}\) of a non-singular matrix \(\mathbf{A}\) are given by the ratio of the cofactor \(a_{ji}\) and the determinant \(|\mathbf{A}|\), that is

    \[\mathbf{A}^{−1}_{ij} = \frac{1}{ |\mathbf{A}|} a_{ji} \label{A.30}\]

    Equations \ref{A.28} and \ref{A.29} can be used to evaluate the \(i, j\) element of the matrix product \(\left( \mathbf{A}^{−1}\mathbf{A}\right)\)

    \[\left( \mathbf{A}^{−1}\mathbf{A}\right)_{ij} = \sum^N_{k=1} \mathbf{A}^{−1}_{ik} A_{kj} = \frac{1}{ |\mathbf{A}|} \sum^N_{k=1} a_{ki}A_{kj} = \frac{1}{ |\mathbf{A}|} \delta_{ji} |\mathbf{A}| = \delta_{ij} = \mathbb{I}_{ij} \label{A.31}\]

    This agrees with Equation \ref{A.11} that \(\mathbf{A} \cdot \mathbf{A}^{−1} = \mathbb{I}\).

    The inverse of rank 2 or 3 matrices is required frequently when determining the eigen-solutions for rigid-body rotation, or coupled oscillator, problems in classical mechanics as described in chapters \(11\) and \(12\). Therefore it is convenient to list explicitly the inverse matrices for both rank 2 and rank 3 matrices.

    Inverse for rank 2 matrices:

    \[\mathbf{A}^{−1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{−1} = \frac{1}{ |\mathbf{A}|} \begin{bmatrix} d & −b \\ −c & a \end{bmatrix} = \frac{1}{ (ad − bc)} \begin{bmatrix} d & −b \\ −c & a \end{bmatrix} \label{A.32}\]

    where the determinant of \(\mathbf{A}\) is written explicitly in Equation \ref{A.32}.

    Inverse for rank 3 matrices:

    \[\mathbf{A}^{−1} =\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}^{−1} = \frac{1}{ |\mathbf{A}|} \begin{bmatrix} A & B & C \\ D & E & F \\ G & H & I \end{bmatrix}^T = \frac{1}{ |\mathbf{A}|} \begin{bmatrix} A & D & G \\ B & E & H \\ C & F & I \end{bmatrix} \\ = \frac{1}{ aA + bB + cC} \begin{bmatrix} A = (ei − fh) & D = − (bi − ch) & G = (bf − ce) \\ B = − (di − fg) & E = (ai − cg) & H = − (af − cd) \\ C = (dh − eg) & F = − (ah − bg) & I = (ae − bd) \end{bmatrix} \label{A.33}\]

    where the quantities \(A, B, C, D, E, F, G, H, I\) are equal to the rank 2 determinants (cofactors) listed in Equation \ref{A.33}.
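    Equations \ref{A.28} through \ref{A.33} can be verified numerically. The NumPy sketch below (an illustration; the matrix is an arbitrary non-singular choice) builds the cofactor matrix of a \(3 \times 3\) matrix, forms the inverse as the transposed cofactor matrix divided by the determinant, and checks the result against numpy.linalg.inv and Equation \ref{A.11}.

    ```python
    import numpy as np

    M = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])    # arbitrary non-singular illustrative matrix

    # Cofactor matrix, Equation (A.28): a_ij = (-1)^(i+j) |complementary minor|
    cof = np.zeros_like(M)
    for i in range(3):
        for j in range(3):
            minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)

    M_inv = cof.T / np.linalg.det(M)              # Equation (A.30): (A^-1)_ij = a_ji / |A|
    print(np.allclose(M_inv, np.linalg.inv(M)))   # True
    print(np.allclose(M @ M_inv, np.eye(3)))      # Equation (A.11): A . A^-1 = I
    ```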

    Reduction of a matrix to diagonal form

    Solving coupled linear equations can be reduced to the diagonalization of a matrix. Consider the matrix \(\mathbf{A}\) operating on the vector \(\mathbf{X}\) to produce a vector \(\mathbf{Y}\), where both vectors are expressed in terms of components with respect to the unprimed coordinate frame, i.e.

    \[\mathbf{A} \cdot \mathbf{X} = \mathbf{Y} \label{A.34}\]

    Consider that the unitary real matrix \(\mathbf{R}\) with rank \(n\) rotates the \(n\)-dimensional unprimed coordinate frame into the primed coordinate frame, such that \(\mathbf{A}\), \(\mathbf{X}\) and \(\mathbf{Y}\) are transformed to \(\mathbf{A}^{\prime} \), \(\mathbf{X}^{\prime}\) and \(\mathbf{Y}^{\prime}\) in the rotated primed coordinate frame. Then

    \[\mathbf{X}^{\prime} = \mathbf{R} \cdot \mathbf{X} \\ \mathbf{Y}^{\prime} = \mathbf{R} \cdot \mathbf{Y} \label{A.35}\]

    With respect to the primed coordinate frame Equation \ref{A.34} becomes

    \[\mathbf{R}\cdot(\mathbf{A} \cdot \mathbf{X}) = \mathbf{R} \cdot \mathbf{Y} \label{A.36}\]

    \[\mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{−1} \cdot \mathbf{R} \cdot \mathbf{X} = \mathbf{R} \cdot \mathbf{Y} \label{A.37}\]

    \[\mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{−1} \cdot \mathbf{X}^{\prime} = \mathbf{A}^{\prime} \cdot \mathbf{X}^{\prime} = \mathbf{Y}^{\prime} \label{A.38}\]

    using the fact that the identity matrix \(\mathbb{I} = \mathbf{R} \cdot \mathbf{R}^{−1} = \mathbf{R} \cdot \mathbf{R}^T\) since the rotation matrix in \(n\) dimensions is orthogonal.

    Thus we have that the rotated matrix

    \[\mathbf{A}^{\prime} = \mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^T \label{A.39}\]

    Assume that this transformed matrix is diagonal; then it can be written as the product of the unit matrix \(\mathbb{I}\) and a set of scalar numbers called the characteristic roots \(\lambda\) as

    \[\mathbf{A}^{\prime} = \mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^T = \lambda \mathbb{I} \label{A.40}\]

    Multiplying Equation \ref{A.40} on the left by \(\mathbf{R}^T\), and using the fact that \(\mathbf{R}^T= \mathbf{R}^{−1}\), gives

    \[\mathbf{R}^T \cdot (\lambda \mathbb{I}) = \mathbf{A} \cdot \mathbf{R}^T \label{A.41}\]

    Letting both sides of Equation \ref{A.41} act on \(\mathbf{X}^{\prime}\), and then multiplying on the left by \(\mathbf{R}\), gives

    \[\lambda \mathbb{I} \cdot \mathbf{X}^{\prime} = \mathbf{A}^{\prime} \cdot\mathbf{X}^{\prime} \label{A.42}\]

    or

    \[[ \lambda \mathbb{I}−\mathbf{A}^{\prime} ] \mathbf{X}^{\prime} = 0 \label{A.43}\]

    This represents a set of \(n\) homogeneous linear algebraic equations in the \(n\) unknown components of \(\mathbf{X}^{\prime}\), where \(\lambda\) denotes the characteristic roots (eigenvalues) with corresponding eigenvectors \(\mathbf{X}^{\prime}\). Ignoring the trivial solution \(\mathbf{X}^{\prime} = 0\), Equation \ref{A.43} has non-trivial solutions only if the secular determinant of the bracket vanishes, that is

    \[|\lambda \mathbb{I}−\mathbf{A}^{\prime}| = 0 \label{A.44}\]

    The determinant can be expanded and factored into the form

    \[(\lambda − \lambda_1) (\lambda − \lambda_2) (\lambda − \lambda_3) \cdots (\lambda − \lambda_n)=0 \label{A.45}\]

    where the \(n\) eigenvalues are \(\lambda = \lambda_1, \lambda_2, \dots \lambda_n\) of the matrix \(\mathbf{A}^{\prime} \).

    The eigenvector \(\mathbf{X}^{\prime}\) corresponding to each eigenvalue is determined by substituting the given eigenvalue \(\lambda_i\) into Equation \ref{A.43} and solving for the ratios of the components of \(\mathbf{X}^{\prime}\). The resulting normalized eigenvectors, assembled as the columns of \(\mathbf{X}^{\prime}\), diagonalize \(\mathbf{A}^{\prime}\) in the sense that

    \[\mathbf{X}^{\prime T} \cdot \mathbf{A}^{\prime} \cdot \mathbf{X}^{\prime} = [\lambda_i \delta_{ij} ] \label{A.46}\]

    If all the eigenvalues are distinct, then this set of \(n\) equations completely determines the ratio of the components of each eigenvector along the axes of the coordinate frame. However, when two or more eigenvalues are identical, the reduction to diagonal form is not unique, and one has the freedom to select an appropriate set of eigenvectors that are mutually orthogonal.

    In summary, the matrix can be fully diagonalized if any one of the following conditions holds:

    (a) all the eigenvalues are distinct,

    (b) the real matrix is symmetric,

    (c) the matrix is unitary.

    A frequent application of matrices in classical mechanics is for solving a system of homogeneous linear equations of the form

    \[\begin{matrix} A_{11}x_1 & +A_{12}x_2 & \dots & +A_{1n}x_n & = & 0 \\ A_{21}x_1 & +A_{22}x_2 & \dots & +A_{2n}x_n & = & 0 \\ \vdots & \vdots & & \vdots & & \vdots \\ A_{n1}x_1 & +A_{n2}x_2 & \dots & +A_{nn}x_n & = & 0 \end{matrix} \label{A.47}\]

    Making the following definitions

    \[\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \dots & A_{1n} \\ A_{21} & A_{22} & \dots & A_{2n} \\ \dots & \dots & \dots & \dots \\ A_{n1} & A_{n2} & \dots & A_{nn} \end{pmatrix} \label{A.48}\]

    \[\mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \label{A.49}\]

    Then the set of linear equations can be written in a compact form using the matrices

    \[\mathbf{A} \cdot \mathbf{X} =0 \label{A.50}\]

    which can be solved using Equation \ref{A.43}. Ensure that you are able to diagonalize matrices with rank 2 and 3. You can use Mathematica, Maple, MATLAB, or other such mathematical computer programs to diagonalize larger matrices.
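    For example, in Python the routine numpy.linalg.eigh returns the eigenvalues and an orthonormal set of eigenvectors of a real symmetric (or Hermitian) matrix. The sketch below is an illustration with an arbitrarily chosen symmetric matrix; it also confirms the rotation relation of Equation \ref{A.39}, with the transposed eigenvector matrix playing the role of \(\mathbf{R}\).

    ```python
    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])    # arbitrary real symmetric illustrative matrix

    eigenvalues, V = np.linalg.eigh(A)   # columns of V are the orthonormal eigenvectors

    # The eigenvector matrix diagonalizes A as in Equations (A.39)-(A.40):
    # with R = V^T, the rotated matrix R A R^T = diag(lambda_1, ..., lambda_n).
    R = V.T
    A_prime = R @ A @ R.T
    print(np.allclose(A_prime, np.diag(eigenvalues)))   # True
    print(eigenvalues)                                  # the characteristic roots lambda_i
    ```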

    Example \(\PageIndex{1}\): Eigenvalues and eigenvectors of a real symmetric matrix

    Consider the matrix

    \[\mathbf{A} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}\nonumber\]

    The secular determinant, Equation \ref{A.44}, is given by

    \[\begin{vmatrix} −\lambda & 1 & 0 \\ 1 & −\lambda & 0 \\ 0 & 0 & −\lambda \end{vmatrix} = 0 \nonumber\]

    This expands to

    \[−\lambda (\lambda + 1)(\lambda − 1) = 0 \nonumber\]

    Thus the three eigenvalues are \(\lambda = −1, 0, 1\).

    To find each eigenvector we substitute the corresponding eigenvalue into Equation \ref{A.43}, that is

    \[\begin{pmatrix} −\lambda & 1 & 0 \\ 1 & −\lambda & 0 \\ 0 & 0 & −\lambda \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \nonumber\]

    The eigenvalue \(\lambda = −1\) yields \(x + y = 0\) and \(z = 0\); thus the eigenvector is \(r_1 = ( \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}}, 0)\). The eigenvalue \(\lambda = 0\) yields \(x = 0\) and \(y = 0\); thus the eigenvector is \(r_2 = (0, 0, 1)\). The eigenvalue \(\lambda = 1\) yields \(−x + y = 0\) and \(z = 0\); thus the eigenvector is \(r_3 = ( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0)\). The orthogonality of these three eigenvectors, which correspond to three distinct eigenvalues, is easily verified.
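    As a numerical cross-check of this example (an illustrative sketch, not part of the original example), the same eigenvalues and eigenvectors follow from numpy.linalg.eigh:

    ```python
    import numpy as np

    A = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])

    eigenvalues, eigenvectors = np.linalg.eigh(A)
    print(eigenvalues)     # [-1.  0.  1.], in agreement with the example
    print(eigenvectors)    # columns are r1, r2, r3 (up to an overall sign)
    ```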


    This page titled 19.2: Appendix - Matrix Algebra is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Douglas Cline via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.