4.1: Array Representations of Vectors, Matrices, and Tensors

Last updated

Apr 30, 2021
Save as PDF
- 4: Numerical Linear Algebra
- 4.2: Linear Equations

$\newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} }$

$\newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}}$

$\newcommand{\id}{\mathrm{id}}$ $\newcommand{\Span}{\mathrm{span}}$

( \newcommand{\kernel}{\mathrm{null}\,}\) $\newcommand{\range}{\mathrm{range}\,}$

$\newcommand{\RealPart}{\mathrm{Re}}$ $\newcommand{\ImaginaryPart}{\mathrm{Im}}$

$\newcommand{\Argument}{\mathrm{Arg}}$ $\newcommand{\norm}[1]{\| #1 \|}$

$\newcommand{\inner}[2]{\langle #1, #2 \rangle}$

$\newcommand{\Span}{\mathrm{span}}$

$\newcommand{\id}{\mathrm{id}}$

$\newcommand{\Span}{\mathrm{span}}$

$\newcommand{\kernel}{\mathrm{null}\,}$

$\newcommand{\range}{\mathrm{range}\,}$

$\newcommand{\RealPart}{\mathrm{Re}}$

$\newcommand{\ImaginaryPart}{\mathrm{Im}}$

$\newcommand{\Argument}{\mathrm{Arg}}$

$\newcommand{\norm}[1]{\| #1 \|}$

$\newcommand{\inner}[2]{\langle #1, #2 \rangle}$

$\newcommand{\Span}{\mathrm{span}}$ $\newcommand{\AA}{\unicode[.8,0]{x212B}}$

$\newcommand{\vectorA}[1]{\vec{#1}} % arrow$

$\newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow$

$\newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} }$

$\newcommand{\vectorC}[1]{\textbf{#1}}$

$\newcommand{\vectorD}[1]{\overrightarrow{#1}}$

$\newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}}$

$\newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}}$

$\newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} }$

$\newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}}$

$\newcommand{\avec}{\mathbf a}$

$\newcommand{\bvec}{\mathbf b}$

$\newcommand{\cvec}{\mathbf c}$

$\newcommand{\dvec}{\mathbf d}$

$\newcommand{\dtil}{\widetilde{\mathbf d}}$

$\newcommand{\evec}{\mathbf e}$

$\newcommand{\fvec}{\mathbf f}$

$\newcommand{\nvec}{\mathbf n}$

$\newcommand{\pvec}{\mathbf p}$

$\newcommand{\qvec}{\mathbf q}$

$\newcommand{\svec}{\mathbf s}$

$\newcommand{\tvec}{\mathbf t}$

$\newcommand{\uvec}{\mathbf u}$

$\newcommand{\vvec}{\mathbf v}$

$\newcommand{\wvec}{\mathbf w}$

$\newcommand{\xvec}{\mathbf x}$

$\newcommand{\yvec}{\mathbf y}$

$\newcommand{\zvec}{\mathbf z}$

$\newcommand{\rvec}{\mathbf r}$

$\newcommand{\mvec}{\mathbf m}$

$\newcommand{\zerovec}{\mathbf 0}$

$\newcommand{\onevec}{\mathbf 1}$

$\newcommand{\real}{\mathbb R}$

$\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}$

$\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}$

$\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}$

$\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}$

$\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}$

$\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}$

$\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}$

$\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}$

$\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}$

$\newcommand{\laspan}[1]{\text{Span}\{#1\}}$

$\newcommand{\bcal}{\cal B}$

$\newcommand{\ccal}{\cal C}$

$\newcommand{\scal}{\cal S}$

$\newcommand{\wcal}{\cal W}$

$\newcommand{\ecal}{\cal E}$

$\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}$

$\newcommand{\gray}[1]{\color{gray}{#1}}$

$\newcommand{\lgray}[1]{\color{lightgray}{#1}}$

$\newcommand{\rank}{\operatorname{rank}}$

$\newcommand{\row}{\text{Row}}$

$\newcommand{\col}{\text{Col}}$

$\renewcommand{\row}{\text{Row}}$

$\newcommand{\nul}{\text{Nul}}$

$\newcommand{\var}{\text{Var}}$

$\newcommand{\corr}{\text{corr}}$

$\newcommand{\len}[1]{\left|#1\right|}$

$\newcommand{\bbar}{\overline{\bvec}}$

$\newcommand{\bhat}{\widehat{\bvec}}$

$\newcommand{\bperp}{\bvec^\perp}$

$\newcommand{\xhat}{\widehat{\xvec}}$

$\newcommand{\vhat}{\widehat{\vvec}}$

$\newcommand{\uhat}{\widehat{\uvec}}$

$\newcommand{\what}{\widehat{\wvec}}$

$\newcommand{\Sighat}{\widehat{\Sigma}}$

$\newcommand{\lt}{<}$

$\newcommand{\gt}{>}$

$\newcommand{\amp}{&}$

$\definecolor{fillinmathshade}{gray}{0.9}$

Thus far, we have discussed simple "one-dimensional" (1D) arrays, which are linear sequences of numbers. In linear algebra terms, 1D arrays represent vectors. The array length corresponds to the "vector dimension" (e.g., a 1D array of length 3 corresponds to a 3-vector). In accordance with Scipy terminology, we will use the work "dimension" to refer to the dimensionality of the array (called the rank in linear algebra), and not the vector dimension.

You are probably familiar with the fact that vectors can be represented using index notation, which is pretty similar to Python's notation for addressing 1D arrays. Consider a length- $d$ vector

$\vec{x} = \begin{bmatrix}x_0\\x_1\\\vdots\\x_{d-1}\end{bmatrix}.$

The $j$ th element can be written as

$x_j \quad \leftrightarrow\quad \texttt{x[j]},$

where $j=0,1,\dots,d-1$ . The notation on the left is mathematical index notation, and the notation on the right is Python's array notation. Note that we are using $0$ -based indexing, so that the first element has index $0$ and the last element has index $d-1$ .

A matrix is a collection of numbers organized using two indices, rather than a single index like a vector. Under $0$ -based indexing, the elements of an $m\times n$ matrix are:

$\mathbf{M} = \begin{bmatrix}M_{00} & M_{01} & \cdots & M_{0,n-1} \\ M_{10} & M_{11}& \cdots & M_{1,n-1} \\ \vdots & \vdots &\ddots&\vdots \\ M_{m-1,0} & M_{m-1,1} & \cdots & M_{m-1,n-1} \end{bmatrix}.$

More generally, numbers that are organized using multiple indices are collectively referred to as tensors. Tensors can have more than two indices. For example, vector cross products are computed using the Levi-Civita tensor $\varepsilon$ , which has three indices:

$\left(\vec{A} \times \vec{B}\right)_i = \sum_{jk} \varepsilon_{ijk} A_j B_k.$

4.1.1 Multi-Dimensional Arrays

In Python, tensors are represented by multi-dimensional arrays, which are similar to 1D arrays except that they are addressed using more than one index. For example, matrices are represented by 2D arrays, and the $(i,j)$ th component of an $m\times n$ matrix is written in Python notation as follows:

$M_{ij} \quad \leftrightarrow\quad \texttt{M[i,j]} \quad\quad \mathrm{for} \;\;i=0,\dots,m-1, \;\;j=0,\dots,n-1.$

Figure $\PageIndex{1}$ : Memory model of a 2D array.

The way multi-dimensional arrays are laid out in memory is very similar to the memory layout of 1D arrays. There is a book-keeping block, which is associated with the array name, and which stores information about the array size (including the number of indices and the size of each index), as well as the memory location of the array contents. The elements lie in a sequence of storage blocks, in a specific order (depending on the array size). This arrangement is shown schematically in Fig. $\PageIndex{1}$ .

When Python needs to access any element of of a multi-dimensional array, it knows exactly which memory location the element is stored in. The location can be worked out from the size of the multi-dimensional array, and the memory location of the first element. In Fig. $\PageIndex{1}$ , for example, M is a $2\times 3$ array, containing $6$ storage blocks laid out in a specific sequence. If we need to access M[1,1], Python knows that it needs to jump to the storage block four blocks down from the $(0,0)$ block. Hence, reading/writing the elements of a multi-dimensional array is an $O(1)$ operation, just like for 1D arrays.

In the following subsections, we will describe how multi-dimensional arrays can be created and manipulated in Python code.

Note

There is also a special Scipy class called matrix which can be used to represent matrices. Don't use this. It's a layer on top of Scipy's multi-dimensional array facilities, mostly intended as a crutch for programmers transitioning from Matlab. Arrays are better to use, and more consistent with the rest of Scipy.

4.1.2 Creating Multi-Dimensional Arrays

You can create a 2D array with specific elements using the array command, with an input consisting of a list of lists:

>>> x = array([[1., 2., 3.], [4., 5., 6.]])
>>> x
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

The above code creates a 2D array (i.e. a matrix) named x. It is a $2\times 3$ array, containing the elements $x_{00}=1$ , $x_{01}=2$ , $x_{02}=3$ , etc. Similarly, you can create a 3D array by supplying an input consisting of a list of lists of lists; and so forth.

It is more common, however, to create multi-dimensional arrays using ones or zeros. These functions return arrays whose elements are all initialized to $0.0$ and $1.0$ , respectively. (You can then assign values to the elements as desired.) To do this, instead of specifying a number as the input (which would create a 1D array of that size), you should specify a tuple as the input. For example,

>>> x = zeros((2,3))
>>> x
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
>>>
>>> y = ones((3,2))
>>> y
array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])

There are many more ways to create multi-dimensional arrays, which we'll discuss when needed.

4.1.3 Basic Array Operations

To check on the dimension of an array, consult its ndim slot:

>>> x = zeros((5,4))
>>> x.ndim
2

To determine the exact shape of the array, use the shape slot (the shape is stored in the form of a tuple):

>>> x.shape
(3, 5)

To access the elements of a multi-dimensional array, use square-bracket notation: M[2,3], T[0,4,2], etc. Just remember that each component is zero-indexed.

Multi-dimensional arrays can be sliced, similar to 1D arrays. There is one important feature of multi-dimensional slicing: if you specify a single value as one of the indices, the slice results in an array of smaller dimensionality. For example:

>>> x = array([[1., 2., 3.], [4., 5., 6.]])
>>> x[:,0]
array([ 1.,  4.])

In the above code, x is a 2D array of size $2\times 3$ . The slice $x[:,0]$ specifies the value $0$ for index $1$ , so the result is a 1D array containing the elements $[x_{00}, x_{10}]$ .

If you don't specify all the indices of a multi-dimensional array, the omitted indices implicitly included, and run over their entire range. For example, for the above x array,

>>> x[1]
array([ 4.,  5.,  6.])

This is also equivalent to x[1,:].

4.1.4 Arithmetic Operations

The basic arithmetic operations can all be performed on multi-dimensional arrays, and act on the arrays element-by-element. For example,

>>> x = ones((2,3))
>>> y = ones((2,3))
>>> z = x + y
>>> z
array([[ 2.,  2.,  2.],
       [ 2.,  2.,  2.]])

You can think of this in terms of index notation:

$z_{ij} = x_{ij} + y_{ij}.$

What is the runtime for performing such arithmetic operations on multi-dimensional arrays? With a bit of thinking, we can convince ourselves that the runtime scales linearly with the number of elements in the multi-dimensional array, because the arithmetic operation is performed on each individual index. For example, the runtime for adding a pair of $M\times N$ matrices scales as $(O(MN)$ .

Note

The multiplication operator * also acts element-by-element. It does not refer to matrix multiplication!

For example,

>>> x = ones((2,3))
>>> y = ones((2,3))
>>> z = x * y
>>> z
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In index notation, we can think of the * operator as doing this:

$z_{ij} = x_{ij} y_{ij}.$

By contrast, matrix multiplication is $z_{ij} = \sum_k x_{ik} y_{kj}$ . We'll see how this is accomplished in the next subsection.

4.1.5 The Dot Operation

The most commonly-used function for array multiplication is the dot function, which takes two array inputs x and y and returns their "dot product". It constructs a product by summing over the last index of array x, and over the next-to-last index of array y (or over its last index, if y is a 1D array). This may sound like a complicated rule, but you should be able to convince yourself that it corresponds to the appropriate type of multiplication operation for the most common cases encountered in linear algebra:

If x and y are both 1D arrays (vectors), then dot corresponds to the usual dot product between two vectors:
$\texttt{z} = \texttt{dot(x,y)} \;\;\; \leftrightarrow \;\;\; z = \sum_{k} x_k\, y_k$
If x is a 2D array and y is a 1D array, then dot corresponds to right-multiplying a matrix by a vector:
$\texttt{z} = \texttt{dot(x,y)} \;\;\; \leftrightarrow \;\;\; z_i = \sum_{k} x_{ik}\, y_k$
If x is a 1D array and y is a 2D array, then dot corresponds to left-multiplication:
$\texttt{z} = \texttt{dot(x,y)} \;\;\; \leftrightarrow \;\;\; z_i = \sum_{k} x_{k} \,y_{ki}$
If x and y are both 2D arrays, dot corresponds to matrix multiplication:
$\texttt{z} = \texttt{dot(x,y)} \;\;\; \leftrightarrow \;\;\; z_{ij} = \sum_{k} x_{ik}\, y_{kj}$
The rule applies to higher-dimensional arrays as well. For example, two rank-3 tensors are multiplied together in this way:
$\texttt{z} = \texttt{dot(x,y)} \;\;\; \leftrightarrow \;\;\; z_{ijpq} = \sum_{k} x_{ijk} \,y_{pkq}$

Should you need to perform more general products than what the dot function provides, you can use the tensordot function. This takes two array inputs, x and y, and a tuple of two integers specifying which components of x and y to sum over. For example, if x and y are 2D arrays,

$\texttt{z} = \texttt{tensordot(x, y, (0,1))} \;\;\; \leftrightarrow \;\;\; z_{ij} = \sum_{k} x_{ki} \,y_{jk}$

What is the runtime for dot and tensordot? Consider a simple case: matrix multiplication of an $M\times N$ matrix with an $N\times P$ matrix. In index notation, this has the form

$C_{ij} = \sum_{k=0}^{N-1} A_{ik} \, B_{kj},\quad \mathrm{for}\;i\in\{0,\dots,M-1\}, \;\;j\in\{0,\dots,P-1\}.$

The resulting matrix has a total of $(M\times P)$ indices to be computed. Each of these calculations requires a sum involving $O(N)$ arithmetic operations. Hence, the total runtime scales as $O(MNP)$ . By similar reasoning, we can figure out the runtime scaling for any tensor product between two tensors: it is the product of the sizes of the unsummed indices, times the size of the summed index. For example, for a tensordot product between an $M\times N\times P$ tensor and a $Q\times S\times P$ tensor, summing over the last index of each tensor, the runtime would scale as $O(MNPQS)$ .

Search

Text Color

Text Size

Margin Size

Font Type

4.1.1 Multi-Dimensional Arrays

4.1.2 Creating Multi-Dimensional Arrays

4.1.3 Basic Array Operations

4.1.4 Arithmetic Operations

4.1.5 The Dot Operation

Support Center

How can we help?