
13.2: Matrices

    Franz S. Hover & Michael S. Triantafyllou
    Massachusetts Institute of Technology via MIT OpenCourseWare

    Definition

    A matrix, or array, is equivalent to a set of column vectors of the same dimension, arranged side by side, say

    \[ A \, = \, [\vec{a} \,\, \vec{b}] \, = \, \begin{bmatrix} 2 & 3\\[4pt] 1 & 3\\[4pt] 7 & 2 \end{bmatrix}. \]

    This matrix has three rows (\(m\) = 3) and two columns (\(n\) = 2); a vector is a special case of a matrix with one column. Matrices, like vectors, permit addition and scalar multiplication. We usually use an upper-case symbol to denote a matrix.

    Multiplying a Vector by a Matrix

    If \(A_{ij}\) denotes the element of matrix \(A\) in the \(i\)'th row and the \(j\)'th column, then the multiplication \(\vec{c} = A \vec{v}\) is constructed as:

    \[ c_i \, = \, A_{i1} v_1 + A_{i2} v_2 + \cdots + A_{in} v_n \, = \, \sum_{j=1}^{n} A_{ij} v_j, \]

    where \(n\) is the number of columns in \(A\). \(\vec{c}\) will have as many rows as \(A\) has rows (\(m\)). Note that this multiplication is defined only if \(\vec{v}\) has as many rows as \(A\) has columns; they have consistent inner dimension \(n\). The product \(\vec{v}A\) would be well-posed only if \(A\) had one row, and the proper number of columns. There is another important interpretation of this vector multiplication – let the subscript : indicate all rows, so that each \(A_{:j}\) is the \(j\)'th column vector. Then

    \[ \vec{c} \, = \, A \vec{v} \, = \, A_{:1} v_1 + A_{:2} v_2 + \cdots + A_{:n} v_n. \]

    We are multiplying column vectors of \(A\) by the scalar elements of \(\vec{v}\).
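
    As a concrete check of both views, here is a minimal numerical sketch in Python with NumPy (the matrix is the one from the definition above; the vector \(\vec{v}\) and the variable names are ours, chosen for illustration):

    ```python
    import numpy as np

    A = np.array([[2, 3],
                  [1, 3],
                  [7, 2]])       # m = 3 rows, n = 2 columns
    v = np.array([4, 5])         # must have n elements

    # Row view: c_i = sum_j A_ij * v_j
    c_rows = np.array([sum(A[i, j] * v[j] for j in range(A.shape[1]))
                       for i in range(A.shape[0])])

    # Column view: c = A_{:1} v_1 + A_{:2} v_2
    c_cols = A[:, 0] * v[0] + A[:, 1] * v[1]

    # Both agree with the built-in matrix-vector product
    assert np.array_equal(c_rows, A @ v)
    assert np.array_equal(c_cols, A @ v)
    ```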

    Multiplying a Matrix by a Matrix

    The multiplication \(C = AB\) is equivalent to a side-by-side arrangement of column vectors \(C_{:j} = AB_{:j}\), so that

    \[ C \, = \, AB \, = \begin{bmatrix} AB_{:1} & AB_{:2} & \cdots & AB_{:k} \end{bmatrix}, \]

    where \(k\) is the number of columns in matrix \(B\). The same inner dimension condition applies as noted above: the number of columns in \(A\) must equal the number of rows in \(B\). Matrix multiplication is:

    1. Associative. \((AB) C = A (BC). \)
    2. Distributive. \(A(B+C) = AB + AC, \, (B+C)A = BA + CA. \)
    3. NOT commutative. \(AB \neq BA\), except in special cases.
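
    As a quick numerical illustration of the column-by-column construction and of non-commutativity, here is a short sketch (Python with NumPy; the matrices are made up for the example):

    ```python
    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[0, 1],
                  [1, 0]])

    C = A @ B

    # Column-by-column construction: C_{:j} = A B_{:j}
    for j in range(B.shape[1]):
        assert np.array_equal(C[:, j], A @ B[:, j])

    # Generally AB != BA
    print(np.array_equal(A @ B, B @ A))   # False for this A and B
    ```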

    Common Matrices

    Identity. The identity matrix, usually denoted \(I\), is a square matrix with ones on the diagonal and zeros elsewhere, e.g.,

    \[ I_{3 \times 3} \, = \, \begin{bmatrix} 1 & 0 & 0 \\[4pt] 0 & 1 & 0 \\[4pt] 0 & 0 & 1 \end{bmatrix}. \]

    For any \(m \times n\) matrix \(A\), the identity satisfies \(AI_{n \times n} \, = \, I_{m \times m}A \, = \, A.\)

    Diagonal Matrices. A diagonal matrix is square, and has all zeros off the diagonal. For instance, the following is a diagonal matrix:

    \[ A \, = \, \begin{bmatrix} 4 & 0 & 0 \\[4pt] 0 & -2 & 0 \\[4pt] 0 & 0 & 3 \end{bmatrix}. \]

    The product of a diagonal matrix with another diagonal matrix is diagonal, and in this case the operation is commutative.
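
    A short sketch confirming both claims, using the diagonal matrix above and a second diagonal matrix invented for the example (Python with NumPy):

    ```python
    import numpy as np

    A = np.diag([4, -2, 3])      # the diagonal matrix from the text
    B = np.diag([1, 5, -7])      # a second diagonal matrix (ours)
    I = np.eye(3, dtype=int)

    assert np.array_equal(A @ I, A)       # A I = A
    assert np.array_equal(I @ A, A)       # I A = A

    # Diagonal-times-diagonal is diagonal, and the product commutes
    assert np.array_equal(A @ B, B @ A)
    assert np.array_equal(A @ B, np.diag([4, -10, -21]))
    ```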

    Transpose

    The transpose of a vector or matrix, indicated by a \(T\) superscript, results from simply swapping the row-column indices of each entry; it is equivalent to “flipping” the vector or matrix around the diagonal line. For example,

    \[ \vec{a} \, = \, \begin{Bmatrix} 1 \\[4pt] 2 \\[4pt] 3 \end{Bmatrix} \longrightarrow \vec{a}^T \, = \, \begin{Bmatrix} 1 & 2 & 3 \end{Bmatrix} \]

    \[ A \, = \, \begin{bmatrix} 1 & 2 \\[4pt] 4 & 5 \\[4pt] 8 & 9 \end{bmatrix} \longrightarrow A^T \, = \, \begin{bmatrix} 1 & 4 & 8 \\[4pt] 2 & 5 & 9 \end{bmatrix}. \]

    A very useful property of the transpose is \[ (AB)^T \, = \, B^T A^T. \]
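
    The transpose property is easy to verify numerically; a minimal sketch (the \(2 \times 3\) matrix \(B\) here is ours, chosen so the product is defined):

    ```python
    import numpy as np

    A = np.array([[1, 2],
                  [4, 5],
                  [8, 9]])        # the 3 x 2 matrix from the text
    B = np.array([[1, 0, 2],
                  [3, 1, 1]])     # 2 x 3, invented for the check

    # (AB)^T = B^T A^T
    assert np.array_equal((A @ B).T, B.T @ A.T)
    ```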

    Determinant

    The determinant of a square matrix \(A\) is a scalar equal to the signed volume of the parallelepiped spanned by its constituent vectors. The two-dimensional case is particularly easy to remember, and illustrates the principle of volume (here, area):

    \[ det(A) \, = \, A_{11} A_{22} - A_{21} A_{12}. \]

    Example \(\PageIndex{1}\)

    \[ det \left( \begin{bmatrix} 1 & -1 \\[4pt] 1 & 1 \end{bmatrix} \right) \, = \, 1 + 1 \, = \, 2. \nonumber \]

    Figure \(\PageIndex{1}\): The two-dimensional parallelepiped formed by the vectors given above: \(\langle 1, 1 \rangle\) and \(\langle 1, -1 \rangle\).

    In higher dimensions, the determinant is more complicated to compute. The general formula allows one to pick a row \(k\), perhaps the one containing the most zeros, and apply \[ det(A) \, = \, \sum_{j=1}^{n} A_{kj} (-1)^{k+j} \Delta_{kj}, \]

    where \(\Delta_{kj}\) is the determinant of the sub-matrix formed by deleting the \(k\)'th row and the \(j\)'th column. The formula is symmetric, in the sense that one could instead expand along the \(k\)'th column: \[ det(A) \, = \, \sum_{j=1}^{n} A_{jk} (-1)^{k+j} \Delta_{jk}.\]

    If the determinant of a matrix is zero, then the matrix is said to be singular – there is no volume, and this results from the fact that the constituent vectors do not span the full \(n\)-dimensional space. For instance, in two dimensions, a singular matrix has its vectors collinear; in three dimensions, a singular matrix has all its vectors lying in a (two-dimensional) plane. Note also that \(det(A) = det(A^T)\). If \(det(A) \neq 0,\) then the matrix is said to be nonsingular.
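
    The cofactor expansion translates directly into a short recursive routine. Below is a minimal sketch (the function name det_cofactor is ours, and indices are 0-based rather than 1-based as in the text; the sign pattern \((-1)^{k+j}\) is unchanged by the shift, since the two offsets cancel):

    ```python
    import numpy as np

    def det_cofactor(A, k=0):
        """Determinant by cofactor expansion along row k (0-based)."""
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            # Delta_kj: determinant of A with row k and column j removed
            minor = np.delete(np.delete(A, k, axis=0), j, axis=1)
            total += A[k, j] * (-1) ** (k + j) * det_cofactor(minor)
        return total

    A = np.array([[1.0, -1.0],
                  [1.0,  1.0]])
    print(det_cofactor(A))                                  # 2.0, as in the example
    print(np.isclose(det_cofactor(A), np.linalg.det(A)))    # True
    ```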

    Inverse

    The inverse of a square matrix \(A\), denoted \(A^{-1}\), satisfies \(AA^{-1} = A^{-1}A = I.\) Its computation requires the determinant above, and the following definition of the \(n \times n\) adjoint matrix:

    \[ adj(A) \, = \, \begin{bmatrix} (-1)^{1+1} \Delta_{11} & \cdots & (-1)^{1+n} \Delta_{1n} \\[4pt] \vdots & \ddots & \vdots \\[4pt] (-1)^{n+1} \Delta_{n1} & \cdots & (-1)^{n+n} \Delta_{nn} \end{bmatrix} ^T . \]

    Once this computation is made, the inverse follows from \[ A^{-1} \, = \, \frac{adj(A)}{det(A)}. \]

    If \(A\) is singular, i.e., \(det(A) = 0\), then the inverse does not exist. The inverse finds common application in solving systems of linear equations such as \[ A \vec{x} = \vec{b} \longrightarrow \vec{x} = A^{-1} \vec{b}. \]
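
    A sketch of the adjoint-based inverse, applied to solving \(A \vec{x} = \vec{b}\) (the helper adj and the right-hand side \(\vec{b}\) are ours; in practice one would call a library solver instead):

    ```python
    import numpy as np

    def adj(A):
        """Adjoint matrix: transposed matrix of signed minors."""
        n = A.shape[0]
        C = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return C.T

    A = np.array([[4.0, -5.0],
                  [2.0, -3.0]])
    A_inv = adj(A) / np.linalg.det(A)     # A^{-1} = adj(A) / det(A)

    b = np.array([1.0, 0.0])              # an arbitrary right-hand side
    x = A_inv @ b
    print(np.allclose(A @ x, b))          # True: x solves A x = b
    ```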

    Eigenvalues and Eigenvectors

    A typical eigenvalue problem is stated as \[A \vec{x} = \lambda \vec{x}, \]

    where \(A\) is an \(n \times n\) matrix, \(\vec{x}\) is a column vector with \(n\) elements, and \(\lambda\) is a scalar. We ask for what nonzero vectors \(\vec{x}\) (right eigenvectors) and scalars \(\lambda\) (eigenvalues) the equation is satisfied. Since the above is equivalent to \((A - \lambda I) \vec{x} = \vec{0}\), and a nonzero solution \(\vec{x}\) can exist only if \(A - \lambda I\) is singular, it follows that \(det (A - \lambda I) = 0.\) This observation leads to the solutions for \(\lambda\); here is an example for the two-dimensional case:

    Example \(\PageIndex{2}\)

    \begin{align*} A \, &= \, \begin{bmatrix} 4 & -5 \\[4pt] 2 & -3 \end{bmatrix} \longrightarrow \\[4pt] A - \lambda I \, &= \, \begin{bmatrix} 4 - \lambda & -5 \\[4pt] 2 & -3 - \lambda \end{bmatrix} \longrightarrow \\[4pt] det(A - \lambda I) \, &= \, (4 - \lambda)(-3 - \lambda) + 10 \\[4pt] &= \, \lambda^2 - \lambda - 2 \\[4pt] &= \, (\lambda + 1)(\lambda - 2). \end{align*}

    Thus, \(A\) has two eigenvalues, \(\lambda_1 = -1\) and \(\lambda_2 = 2\). Each is associated with a right eigenvector \(\vec{x}.\) In this example,

    \begin{align*} (A - \lambda_1 I) \vec{x}_1 \, &= \, \vec{0} \longrightarrow \\[4pt] \begin{bmatrix} 5 & -5 \\[4pt] 2 & -2 \end{bmatrix} \vec{x}_1 \, &= \, \vec{0} \longrightarrow \\[4pt] \vec{x}_1 \, &= \, \begin{Bmatrix} \sqrt{2}/2, \,\, \sqrt{2}/2 \end{Bmatrix} ^T \\[4pt] \quad \\[4pt] (A - \lambda_2 I) \vec{x}_2 \, &= \, \vec{0} \longrightarrow \\[4pt] \begin{bmatrix} 2 & -5 \\[4pt] 2 & -5 \end{bmatrix} \vec{x}_2 \, &= \, \vec{0} \longrightarrow \\[4pt] \vec{x}_2 \, &= \, \begin{Bmatrix} 5 \sqrt{29} / 29, \,\, 2 \sqrt{29} / 29 \end{Bmatrix} ^T. \end{align*}

    Eigenvectors are defined only up to an arbitrary constant, i.e., if \(\vec{x}\) is an eigenvector then \(c \vec{x}\) is also an eigenvector for any \(c \neq 0\). They are often normalized to have unity magnitude and positive first element (as above). The condition \(rank(A - \lambda_i I) = n - 1\) indicates that there is only one eigenvector for the eigenvalue \(\lambda_i\); more precisely, a unique direction for the eigenvector, since the magnitude can be arbitrary. If the rank is lower than this, then there are multiple independent eigenvectors that go with \(\lambda_i\).
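
    The eigenvalue example above can be reproduced numerically; a minimal sketch (np.linalg.eig returns unit-magnitude eigenvectors, and the sign flip below matches the positive-first-element convention used in the text):

    ```python
    import numpy as np

    A = np.array([[4.0, -5.0],
                  [2.0, -3.0]])

    lam, X = np.linalg.eig(A)     # eigenvalues and right eigenvectors
    print(lam)                    # -1 and 2 (order may vary)

    # Columns of X are the eigenvectors; flip signs so the first
    # element of each is positive (valid here since none is zero)
    X = X * np.sign(X[0, :])

    for i in range(2):
        assert np.allclose(A @ X[:, i], lam[i] * X[:, i])   # A x = lambda x
    ```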

    The above discussion relates only to the right eigenvectors, generated from the equation \(A \vec{x} = \lambda \vec{x}\). Left eigenvectors, defined by \(\vec{y}^T A = \lambda \vec{y}^T\), are also useful for many problems, and can be computed simply as the right eigenvectors of \(A^T\). \(A\) and \(A^T\) share the same eigenvalues \(\lambda\), since \(det(A - \lambda I) = det(A^T - \lambda I)\). Example:

    Example \(\PageIndex{3}\)

    \begin{align*} (A^T - \lambda_1 I) \vec{y}_1 \, &= \, \vec{0} \longrightarrow \\[4pt] \begin{bmatrix} \,\, 5 & \,\, 2 \\[4pt] -5 & -2 \end{bmatrix} \vec{y}_1 \, &= \, \vec{0} \longrightarrow \\[4pt] \vec{y}_1 \, &= \, \begin{Bmatrix} 2 \sqrt{29} / 29, \,\, -5 \sqrt{29} / 29 \end{Bmatrix} ^T \\[4pt] \quad \\[4pt] (A^T - \lambda_2 I) \vec{y}_2 \, &= \, \vec{0} \longrightarrow \\[4pt] \begin{bmatrix} \,\, 2 & \,\, 2 \\[4pt] -5 & -5 \end{bmatrix} \vec{y}_2 \, &= \, \vec{0} \longrightarrow \\[4pt] \vec{y}_2 \, &= \, \begin{Bmatrix} \sqrt{2} / 2, \,\, - \sqrt{2} / 2 \end{Bmatrix} ^T. \end{align*}
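
    Numerically, the left eigenvectors come from the same routine applied to \(A^T\); a brief sketch continuing the example:

    ```python
    import numpy as np

    A = np.array([[4.0, -5.0],
                  [2.0, -3.0]])

    # Right eigenvectors of A^T are the left eigenvectors of A
    lam, Y = np.linalg.eig(A.T)
    Y = Y * np.sign(Y[0, :])      # positive first element, as in the text

    for i in range(2):
        assert np.allclose(Y[:, i] @ A, lam[i] * Y[:, i])   # y^T A = lambda y^T
    ```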

    Modal Decomposition

    For simplicity, we consider matrices that have unique eigenvectors for each eigenvalue. The right and left eigenvectors corresponding to a particular eigenvalue \(\lambda\) can be defined to have unity dot product, that is \(\vec{x}_i^T \vec{y}_i = 1\), with the normalization noted above. The dot products of a left eigenvector with the right eigenvectors corresponding to different eigenvalues are zero. Thus, if the set of right and left eigenvectors, \(V\) and \(W\), respectively, is

    \begin{align} V \, &= \, [ \vec{x}_1 \cdots \vec{x}_n], \,\,\, \text{and} \\[4pt] W \, &= \, [ \vec{y}_1 \cdots \vec{y}_n], \nonumber \end{align}

    then we have \begin{align} W^T V \, &= \, I, \,\,\, \text{or} \\[4pt] W^T \, &= \, V^{-1}. \nonumber \end{align}

    Next, construct a diagonal matrix containing the eigenvalues:

    \[ \Lambda \, = \, \begin{bmatrix} \lambda_1 & & 0 \\[4pt] & \ddots & \\[4pt] 0 & & \lambda_n \end{bmatrix}; \] it follows that

    \begin{align} A V \, &= \, V \Lambda \longrightarrow \nonumber \\[4pt] A \, &= \, V \Lambda W^T \\[4pt] &= \, \sum_{i=1}^n \lambda_i \vec{x}_i \vec{y}_i ^T. \nonumber \end{align}

    Hence \(A\) can be written as a sum of modal components. By carrying out successive multiplications, it can be shown that \(A^k\) has its eigenvalues at \(\lambda_i ^k\), and keeps the same eigenvectors as \(A\).
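
    A closing sketch of the decomposition, using the \(2 \times 2\) example from earlier (here \(W^T\) is obtained as \(V^{-1}\), per the relation above):

    ```python
    import numpy as np

    A = np.array([[4.0, -5.0],
                  [2.0, -3.0]])

    lam, V = np.linalg.eig(A)     # columns of V are right eigenvectors
    W_T = np.linalg.inv(V)        # W^T = V^{-1}

    # A as a sum of modal components: A = sum_i lambda_i x_i y_i^T
    A_modal = sum(lam[i] * np.outer(V[:, i], W_T[i, :]) for i in range(2))
    assert np.allclose(A, A_modal)

    # A^k keeps the same eigenvectors, with eigenvalues lambda_i^k
    k = 3
    assert np.allclose(np.linalg.matrix_power(A, k),
                       V @ np.diag(lam**k) @ W_T)
    ```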


    This page titled 13.2: Matrices is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Franz S. Hover & Michael S. Triantafyllou (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.