
16.3: Special Matrices


    Let us now introduce a few special matrices that we shall encounter frequently in numerical methods.

    Diagonal Matrices

    A square matrix \(A\) is said to be diagonal if the off-diagonal entries are zero, i.e. \[A_{i j}=0, \quad i \neq j .\]

    Example 16.3.1 Diagonal matrices

    Examples of diagonal matrices are \[A=\left(\begin{array}{ll} 1 & 0 \\ 0 & 3 \end{array}\right), \quad B=\left(\begin{array}{ccc} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{array}\right), \quad \text { and } \quad C=(4) \text {. }\] The identity matrix is a special case of a diagonal matrix with all the diagonal entries equal to 1. Any \(1 \times 1\) matrix is trivially diagonal as it does not have any off-diagonal entries.
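
    As a computational aside (not part of the original text), the defining condition \(A_{ij}=0\) for \(i \neq j\) can be checked directly in NumPy; the helper name is_diagonal below is our own choice for illustration.

        import numpy as np

        # Diagonal matrices from Example 16.3.1
        A = np.diag([1.0, 3.0])
        B = np.diag([2.0, 1.0, 7.0])

        def is_diagonal(M):
            # True if every off-diagonal entry of the square matrix M is zero
            off_diagonal = M[~np.eye(M.shape[0], dtype=bool)]
            return np.all(off_diagonal == 0)

        print(is_diagonal(A), is_diagonal(B))  # True True
        print(is_diagonal(np.eye(3)))          # the identity matrix is diagonal: True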

    Symmetric Matrices

    A square matrix \(A\) is said to be symmetric if the off-diagonal entries are symmetric about the diagonal, i.e. \[A_{i j}=A_{j i}, \quad i=1, \ldots, m, \quad j=1, \ldots, m .\] An equivalent statement is that \(A\) is not changed by the transpose operation, i.e. \[A^{\mathrm{T}}=A .\] We note that the identity matrix is a special case of a symmetric matrix. Let us look at a few more examples.

    Example 16.3.2 Symmetric matrices

    Examples of symmetric matrices are \[A=\left(\begin{array}{cc} 1 & -2 \\ -2 & 3 \end{array}\right), \quad B=\left(\begin{array}{ccc} 2 & \pi & 3 \\ \pi & 1 & -1 \\ 3 & -1 & 7 \end{array}\right), \quad \text { and } \quad C=(4)\] Note that any scalar, or a \(1 \times 1\) matrix, is trivially symmetric and unchanged under transpose.
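
    A short NumPy sketch, again purely illustrative, checks the symmetry condition by comparing a matrix with its transpose (the helper name is_symmetric is introduced here, not in the text):

        import numpy as np

        # Symmetric matrices from Example 16.3.2
        A = np.array([[1.0, -2.0],
                      [-2.0, 3.0]])
        B = np.array([[2.0, np.pi, 3.0],
                      [np.pi, 1.0, -1.0],
                      [3.0, -1.0, 7.0]])

        def is_symmetric(M):
            # True if M equals its transpose, i.e. M_ij = M_ji for all i, j
            return np.array_equal(M, M.T)

        print(is_symmetric(A), is_symmetric(B))  # True True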

    Symmetric Positive Definite Matrices

    An \(m \times m\) square matrix \(A\) is said to be symmetric positive definite (SPD) if it is symmetric and furthermore satisfies \[v^{\mathrm{T}} A v>0, \quad \forall v \in \mathbb{R}^{m} \; (v \neq 0) .\] Before we discuss its properties, let us give an example of an SPD matrix.

    Example 16.3.3 Symmetric positive definite matrices

    An example of a symmetric positive definite matrix is \[A=\left(\begin{array}{cc} 2 & -1 \\ -1 & 2 \end{array}\right) \text {. }\] We can confirm that \(A\) is symmetric by inspection. To check whether \(A\) is positive definite, let us consider the quadratic form \[\begin{aligned} q(v) & \equiv v^{\mathrm{T}} A v=\sum_{i=1}^{2} v_{i}\left(\sum_{j=1}^{2} A_{i j} v_{j}\right)=\sum_{i=1}^{2} \sum_{j=1}^{2} A_{i j} v_{i} v_{j} \\ &=A_{11} v_{1}^{2}+A_{12} v_{1} v_{2}+A_{21} v_{2} v_{1}+A_{22} v_{2}^{2} \\ &=A_{11} v_{1}^{2}+2 A_{12} v_{1} v_{2}+A_{22} v_{2}^{2} \end{aligned}\] where the last equality follows from the symmetry condition \(A_{12}=A_{21}\). Substituting the entries of \(A\), \[q(v)=v^{\mathrm{T}} A v=2 v_{1}^{2}-2 v_{1} v_{2}+2 v_{2}^{2}=2\left[\left(v_{1}-\frac{1}{2} v_{2}\right)^{2}-\frac{1}{4} v_{2}^{2}+v_{2}^{2}\right]=2\left[\left(v_{1}-\frac{1}{2} v_{2}\right)^{2}+\frac{3}{4} v_{2}^{2}\right] .\] Because \(q(v)\) is a sum of two non-negative terms (each involving a square), it is non-negative. It is equal to zero only if \[v_{1}-\frac{1}{2} v_{2}=0 \quad \text { and } \quad \frac{3}{4} v_{2}^{2}=0 .\] The second condition requires \(v_{2}=0\), and the first condition with \(v_{2}=0\) requires \(v_{1}=0\). Thus, we have \[q(v)=v^{\mathrm{T}} A v>0, \quad \forall v \in \mathbb{R}^{2} \; (v \neq 0),\] and \(v^{\mathrm{T}} A v=0\) only if \(v=0\). Thus \(A\) is symmetric positive definite.
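
    The quadratic-form argument above can be spot-checked numerically. The sketch below is an illustration rather than part of the text: it evaluates \(v^{\mathrm{T}} A v\) for a few random nonzero vectors and also uses the eigenvalue characterization (a symmetric matrix is positive definite exactly when all of its eigenvalues are positive), which the text has not introduced here.

        import numpy as np

        A = np.array([[2.0, -1.0],
                      [-1.0, 2.0]])

        # Evaluate the quadratic form q(v) = v^T A v for a few random nonzero vectors
        rng = np.random.default_rng(0)
        for _ in range(5):
            v = rng.standard_normal(2)
            print(v @ A @ v > 0)  # expected: True for every nonzero v

        # Equivalent characterization (not used in the text): a symmetric matrix is
        # positive definite exactly when all of its eigenvalues are positive
        print(np.all(np.linalg.eigvalsh(A) > 0))  # True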

    Symmetric positive definite matrices are encountered in many areas of engineering and science. They arise naturally in the numerical solution of, for example, the heat equation, the wave equation, and the linear elasticity equations. One important property of symmetric positive definite matrices is that they are always invertible: \(A^{-1}\) always exists. Thus, if \(A\) is an SPD matrix, then, for any \(b\), there is always a unique \(x\) such that \[A x=b .\] In a later unit, we will discuss techniques for the solution of linear systems, such as the one above. For now, we just note that there are particularly efficient techniques for solving the system when the matrix is symmetric positive definite.
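
    One such efficient technique, not named in the text at this point, is Cholesky factorization, which exploits both symmetry and positive definiteness. A minimal sketch, assuming SciPy is available:

        import numpy as np
        from scipy.linalg import cho_factor, cho_solve

        A = np.array([[2.0, -1.0],
                      [-1.0, 2.0]])
        b = np.array([1.0, 0.0])

        # Cholesky factorization succeeds only for SPD matrices
        # (SciPy raises LinAlgError otherwise), so it doubles as an SPD test.
        c, low = cho_factor(A)
        x = cho_solve((c, low), b)

        print(x)                      # the unique solution of A x = b
        print(np.allclose(A @ x, b))  # True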

    Triangular Matrices

    Triangular matrices are square matrices whose entries are all zero either below or above the diagonal. An \(m \times m\) square matrix is said to be upper triangular if all entries below the diagonal are zero, i.e. \[A_{i j}=0, \quad i>j .\] A square matrix is said to be lower triangular if all entries above the diagonal are zero, i.e. \[A_{i j}=0, \quad j>i .\] We will see later that a linear system, \(A x=b\), in which \(A\) is a triangular matrix is particularly easy to solve. Furthermore, the linear system is guaranteed to have a unique solution as long as all diagonal entries are nonzero.

    Example 16.3.4 Triangular matrices

    Examples of upper triangular matrices are \[A=\left(\begin{array}{cc} 1 & -2 \\ 0 & 3 \end{array}\right) \quad \text { and } \quad B=\left(\begin{array}{ccc} 1 & 0 & 2 \\ 0 & 4 & 1 \\ 0 & 0 & -3 \end{array}\right)\] Examples of lower triangular matrices are \[C=\left(\begin{array}{cc} 1 & 0 \\ -7 & 6 \end{array}\right) \quad \text { and } \quad D=\left(\begin{array}{ccc} 2 & 0 & 0 \\ 7 & -5 & 0 \\ 3 & 1 & 4 \end{array}\right)\]
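
    To illustrate the ease-of-solution claim, the sketch below solves \(B x = b\) for the upper triangular matrix \(B\) above by back substitution via scipy.linalg.solve_triangular; the right-hand side \(b\) is chosen arbitrarily for illustration.

        import numpy as np
        from scipy.linalg import solve_triangular

        # Upper triangular matrix B from Example 16.3.4
        B = np.array([[1.0, 0.0,  2.0],
                      [0.0, 4.0,  1.0],
                      [0.0, 0.0, -3.0]])
        b = np.array([3.0, 5.0, 6.0])

        # Back substitution: solve the last equation first, then work upward
        x = solve_triangular(B, b, lower=False)

        print(np.allclose(B @ x, b))    # True
        print(np.all(np.diag(B) != 0))  # nonzero diagonal => unique solution: True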

    Begin Advanced Material

    Orthogonal Matrices

    An \(m \times m\) square matrix \(Q\) is said to be orthogonal if its columns form an orthonormal set. That is, if we denote the \(j\)-th column of \(Q\) by \(q_{j}\), we have \[Q=\left(\begin{array}{llll} q_{1} & q_{2} & \cdots & q_{m} \end{array}\right),\] where \[q_{i}^{\mathrm{T}} q_{j}=\left\{\begin{array}{ll} 1, & i=j \\ 0, & i \neq j \end{array} .\right.\] Orthogonal matrices have a special property: \[Q^{\mathrm{T}} Q=I .\] This relationship follows directly from the fact that the columns of \(Q\) form an orthonormal set. Recall that the \(i j\) entry of \(Q^{\mathrm{T}} Q\) is the inner product of the \(i\)-th row of \(Q^{\mathrm{T}}\) (which is the \(i\)-th column of \(Q\)) and the \(j\)-th column of \(Q\). Thus, \[\left(Q^{\mathrm{T}} Q\right)_{i j}=q_{i}^{\mathrm{T}} q_{j}=\left\{\begin{array}{ll} 1, & i=j \\ 0, & i \neq j \end{array},\right.\] which is the definition of the identity matrix. Orthogonal matrices also satisfy \[Q Q^{\mathrm{T}}=I,\] which in fact is a minor miracle.

    Example 16.3.5 Orthogonal matrices

    Examples of orthogonal matrices are \[Q=\left(\begin{array}{cc} 2 / \sqrt{5} & -1 / \sqrt{5} \\ 1 / \sqrt{5} & 2 / \sqrt{5} \end{array}\right) \text { and } I=\left(\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right)\] We can easily verify that the columns of the matrix \(Q\) are orthogonal to each other and each is of unit length. Thus, \(Q\) is an orthogonal matrix. We can also directly confirm that \(Q^{\mathrm{T}} Q=Q Q^{\mathrm{T}}=I\). Similarly, the identity matrix is trivially orthogonal.

    Let us discuss a few important properties of orthogonal matrices. First, the action of an orthogonal matrix preserves the 2-norm of a vector, i.e. \[\|Q x\|_{2}=\|x\|_{2}, \quad \forall x \in \mathbb{R}^{m} .\] This follows directly from the definition of the 2-norm and the fact that \(Q^{\mathrm{T}} Q=I\), i.e. \[\|Q x\|_{2}^{2}=(Q x)^{\mathrm{T}}(Q x)=x^{\mathrm{T}} Q^{\mathrm{T}} Q x=x^{\mathrm{T}} I x=x^{\mathrm{T}} x=\|x\|_{2}^{2} .\] Second, orthogonal matrices are always invertible. In fact, solving a linear system defined by an orthogonal matrix is trivial because \[Q x=b \quad \Rightarrow \quad Q^{\mathrm{T}} Q x=Q^{\mathrm{T}} b \quad \Rightarrow \quad x=Q^{\mathrm{T}} b .\] In considering linear spaces, we observed that a basis provides a unique description of vectors in a space \(V\) in terms of the coefficients. As the columns of \(Q\) form an orthonormal set of \(m\) \(m\)-vectors, they can be thought of as a basis of \(\mathbb{R}^{m}\). In solving \(Q x=b\), we are finding the representation of \(b\) in terms of the coefficients of \(\left\{q_{1}, \ldots, q_{m}\right\}\). Thus, the operation by \(Q^{\mathrm{T}}\) (or \(Q\)) represents a simple coordinate transformation. Let us solidify this idea by showing that a rotation matrix in \(\mathbb{R}^{2}\) is an orthogonal matrix.
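
    Both properties can be verified numerically for the matrix \(Q\) of Example 16.3.5; the following NumPy sketch is illustrative only, with \(x\) and \(b\) chosen arbitrarily.

        import numpy as np

        # Orthogonal matrix Q from Example 16.3.5
        Q = np.array([[2.0, -1.0],
                      [1.0,  2.0]]) / np.sqrt(5.0)

        # The 2-norm is preserved: ||Q x||_2 = ||x||_2
        x = np.array([3.0, -4.0])
        print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True

        # Solving Q x = b requires only a matrix-vector product: x = Q^T b
        b = np.array([1.0, 2.0])
        x_sol = Q.T @ b
        print(np.allclose(Q @ x_sol, b))  # True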

    Example 16.3.6 Rotation matrix

    Rotation of a vector is equivalent to representing the vector in a rotated coordinate system. A rotation matrix that rotates a vector in \(\mathbb{R}^{2}\) by angle \(\theta\) is \[R(\theta)=\left(\begin{array}{cc} \cos (\theta) & -\sin (\theta) \\ \sin (\theta) & \cos (\theta) \end{array}\right) .\] Let us verify that the rotation matrix is orthogonal for any \(\theta\). The two columns are orthogonal because \[r_{1}^{\mathrm{T}} r_{2}=\left(\begin{array}{cc} \cos (\theta) & \sin (\theta) \end{array}\right)\left(\begin{array}{c} -\sin (\theta) \\ \cos (\theta) \end{array}\right)=-\cos (\theta) \sin (\theta)+\sin (\theta) \cos (\theta)=0, \quad \forall \theta .\] Each column is of unit length because \[\begin{aligned} &\left\|r_{1}\right\|_{2}^{2}=(\cos (\theta))^{2}+(\sin (\theta))^{2}=1 \\ &\left\|r_{2}\right\|_{2}^{2}=(-\sin (\theta))^{2}+(\cos (\theta))^{2}=1, \quad \forall \theta . \end{aligned}\] Thus, the columns of the rotation matrix are orthonormal, and the matrix is orthogonal. This result verifies that the action of the orthogonal matrix represents a coordinate transformation in \(\mathbb{R}^{2}\). The interpretation of an orthogonal matrix as a coordinate transformation readily extends to higher-dimensional spaces.
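
    A brief numerical check, illustrative rather than part of the original example, confirms \(R(\theta)^{\mathrm{T}} R(\theta)=I\) for several values of \(\theta\):

        import numpy as np

        def rotation(theta):
            # 2x2 rotation matrix R(theta) from Example 16.3.6
            c, s = np.cos(theta), np.sin(theta)
            return np.array([[c, -s],
                             [s,  c]])

        for theta in [0.0, 0.3, np.pi / 4, 2.0]:
            R = rotation(theta)
            print(np.allclose(R.T @ R, np.eye(2)))  # True for every theta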

    Orthonormal Matrices

    Let us define orthonormal matrices to be \(m \times n\) matrices whose columns form an orthonormal set, i.e. \[Q=\left(\begin{array}{llll} q_{1} & q_{2} & \cdots & q_{n} \end{array}\right),\] with \[q_{i}^{\mathrm{T}} q_{j}= \begin{cases}1, & i=j \\ 0, & i \neq j .\end{cases}\] Note that, unlike an orthogonal matrix, we do not require the matrix to be square. Just like orthogonal matrices, we have \[Q^{\mathrm{T}} Q=I,\] where \(I\) is an \(n \times n\) matrix. The proof is identical to that for the orthogonal matrix. However, \(Q Q^{\mathrm{T}}\) does not yield an identity matrix, \[Q Q^{\mathrm{T}} \neq I\] unless of course \(m=n\).

    Example 16.3.7 Orthonormal matrices

    An example of an orthonormal matrix is \[Q=\left(\begin{array}{cc} 1 / \sqrt{6} & -2 / \sqrt{5} \\ 2 / \sqrt{6} & 1 / \sqrt{5} \\ 1 / \sqrt{6} & 0 \end{array}\right)\] We can verify that \(Q^{\mathrm{T}} Q=I\) because \[Q^{\mathrm{T}} Q=\left(\begin{array}{ccc} 1 / \sqrt{6} & 2 / \sqrt{6} & 1 / \sqrt{6} \\ -2 / \sqrt{5} & 1 / \sqrt{5} & 0 \end{array}\right)\left(\begin{array}{cc} 1 / \sqrt{6} & -2 / \sqrt{5} \\ 2 / \sqrt{6} & 1 / \sqrt{5} \\ 1 / \sqrt{6} & 0 \end{array}\right)=\left(\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\right)\] However, \(Q Q^{\mathrm{T}} \neq I\) because \[Q Q^{\mathrm{T}}=\left(\begin{array}{cc} 1 / \sqrt{6} & -2 / \sqrt{5} \\ 2 / \sqrt{6} & 1 / \sqrt{5} \\ 1 / \sqrt{6} & 0 \end{array}\right)\left(\begin{array}{ccc} 1 / \sqrt{6} & 2 / \sqrt{6} & 1 / \sqrt{6} \\ -2 / \sqrt{5} & 1 / \sqrt{5} & 0 \end{array}\right)=\left(\begin{array}{ccc} 29 / 30 & -1 / 15 & 1 / 6 \\ -1 / 15 & 13 / 15 & 1 / 3 \\ 1 / 6 & 1 / 3 & 1 / 6 \end{array}\right)\]
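
    The two products can be checked numerically; the sketch below (illustrative only) reproduces the \(2 \times 2\) identity for \(Q^{\mathrm{T}} Q\) and shows that \(Q Q^{\mathrm{T}}\) is not the \(3 \times 3\) identity.

        import numpy as np

        # Orthonormal (3 x 2) matrix Q from Example 16.3.7
        Q = np.array([[1 / np.sqrt(6), -2 / np.sqrt(5)],
                      [2 / np.sqrt(6),  1 / np.sqrt(5)],
                      [1 / np.sqrt(6),  0.0]])

        print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q^T Q is the 2x2 identity
        print(np.allclose(Q @ Q.T, np.eye(3)))  # False: Q Q^T is not the 3x3 identity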

    End Advanced Material


    This page titled 16.3: Special Matrices is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Masayuki Yano, James Douglass Penn, George Konidaris, & Anthony T Patera (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.