14.3: Matrix Diagonalization
From our understanding of eigenvalues and eigenvectors (Section 14.2), we have discovered several things about our operator matrix, \(A\). We know that if the eigenvectors of \(A\) span \(\mathbb{C}^n\), and we know how to express any vector \(\mathbf{x}\) in terms of \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\), then we have the operator \(A\) all figured out: \(A\) acting on \(\mathbf{x}\) is equal to \(A\) acting on a combination of eigenvectors, which we know proves to be fairly easy!
We are still left with two questions that need to be addressed:
- When do the eigenvectors \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\) of \(A\) span \(\mathbb{C}^n\) (that is, when are \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\) linearly independent)?
- How do we express a given vector \(\mathbf{x}\) in terms of \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\)?
Answer to Question #1
Question #1
When do the eigenvectors \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\) of \(A\) span \(\mathbb{C}^n\)?
If \(A\) has \(n\) distinct eigenvalues
\[\lambda_{i} \neq \lambda_{j}, i \neq j \nonumber \]
where \(i\) and \(j\) are integers, then \(A\) has \(n\) linearly independent eigenvectors \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\) which then span \(\mathbb{C}^{n}\).
Aside
The proof of this statement is not very hard, but it is not interesting enough to include here. If you wish to research this idea further, see Strang, G., "Linear Algebra and Its Applications" for the proof.
Furthermore, \(n\) distinct eigenvalues means
\[\operatorname{det}(A-\lambda I)=c_{n} \lambda^{n}+c_{n-1} \lambda^{n-1}+\ldots+c_{1} \lambda+c_{0}=0 \nonumber \]
has \(n\) distinct roots.
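As a quick numerical check of this claim, we can verify it in NumPy with a small matrix of our own choosing (the matrix below is illustrative, not one from the text): distinct eigenvalues come with a full set of linearly independent eigenvectors.

```python
import numpy as np

# Illustrative example matrix; its characteristic polynomial
# det(A - lambda*I) = (2 - lambda)(3 - lambda) has two distinct roots.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
n = A.shape[0]

# n distinct eigenvalues ...
assert len(np.unique(np.round(eigenvalues, 10))) == n

# ... imply n linearly independent eigenvectors: the matrix whose
# columns are the eigenvectors has full rank.
assert np.linalg.matrix_rank(eigenvectors) == n
```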
Answer to Question #2
Question #2
How do we express a given vector \(\mathbf{x}\) in terms of \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\)?
We want to find \(\left\{\alpha_{1}, \alpha_{2}, \ldots, \alpha_{n}\right\} \in \mathbb{C}\) such that
\[\mathbf{x}=\alpha_{1} v_{1}+\alpha_{2} v_{2}+\ldots+\alpha_{n} v_{n} \label{14.4} \]
In order to find this set of variables, we will begin by collecting the vectors \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\) as columns in a \(n \times n\) matrix \(V\).
\[V=\left(\begin{array}{cccc}
\vdots & \vdots & & \vdots \\
v_{1} & v_{2} & \dots & v_{n} \\
\vdots & \vdots & & \vdots
\end{array}\right) \nonumber \]
Now Equation \ref{14.4} becomes
\[\mathbf{x}=\left(\begin{array}{cccc}
\vdots & \vdots & & \vdots \\
v_{1} & v_{2} & \ldots & v_{n} \\
\vdots & \vdots & & \vdots
\end{array}\right)\left(\begin{array}{c}
\alpha_{1} \\
\vdots \\
\alpha_{n}
\end{array}\right) \nonumber \]
or
\[\mathbf{x}=V \mathbf{\alpha} \nonumber \]
which gives us an easy form to solve for our variables in question, \(\mathbf{\alpha}\):
\[\mathbf{\alpha}=V^{-1} \mathbf{x} \nonumber \]
Note that \(V\) is invertible since it has \(n\) linearly independent columns.
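As a sketch of this computation (the matrix and vector here are hypothetical, chosen only for illustration), we can compute \(\alpha=V^{-1} \mathbf{x}\) in NumPy; solving the linear system is preferred over forming the inverse explicitly.

```python
import numpy as np

# Hypothetical example matrix and input vector
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
x = np.array([1.0, 2.0])

# Columns of V are the eigenvectors v_1, ..., v_n of A
_, V = np.linalg.eig(A)

# alpha = V^{-1} x, computed by solving V alpha = x
alpha = np.linalg.solve(V, x)

# Verify the expansion x = alpha_1 v_1 + ... + alpha_n v_n
assert np.allclose(V @ alpha, x)
```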
Aside
Let us recall our knowledge of functions and their basis and examine the role of \(V\).
\[\mathbf{x}=V \mathbf{\alpha} \nonumber \]
\[\left(\begin{array}{c}
x_{1} \\
\vdots \\
x_{n}
\end{array}\right)=V\left(\begin{array}{c}
\alpha_{1} \\
\vdots \\
\alpha_{n}
\end{array}\right) \nonumber \]
where \(\mathbf{\alpha}\) is just \(\mathbf{x}\) expressed in a different basis:
\[x=x_{1}\left(\begin{array}{c}
1 \\
0 \\
\vdots \\
0
\end{array}\right)+x_{2}\left(\begin{array}{c}
0 \\
1 \\
\vdots \\
0
\end{array}\right)+\cdots+x_{n}\left(\begin{array}{c}
0 \\
0 \\
\vdots \\
1
\end{array}\right) \nonumber \]
\[x=\alpha_{1}\left(\begin{array}{c}
\vdots \\
v_{1} \\
\vdots
\end{array}\right)+\alpha_{2}\left(\begin{array}{c}
\vdots \\
v_{2} \\
\vdots
\end{array}\right)+\cdots+\alpha_{n}\left(\begin{array}{c}
\vdots \\
v_{n} \\
\vdots
\end{array}\right) \nonumber \]
Thus \(V^{-1}\) transforms \(\mathbf{x}\) from the standard basis to the basis \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\), and \(V\) transforms it back.
Matrix Diagonalization and Output
We can also use the vectors \(\left\{v_{1}, v_{2}, \ldots, v_{n}\right\}\) to represent the output, \(\mathbf{b}\), of a system:
\[\mathbf{b}=A \mathbf{x}=A\left(\alpha_{1} v_{1}+\alpha_{2} v_{2}+\ldots+\alpha_{n} v_{n}\right) \nonumber \]
\[A \mathbf{x}=\alpha_{1} \lambda_{1} v_{1}+\alpha_{2} \lambda_{2} v_{2}+\ldots+\alpha_{n} \lambda_{n} v_{n}=\mathbf{b} \nonumber \]
\[A \mathbf{x}=\left(\begin{array}{cccc}
\vdots & \vdots & & \vdots \\
v_{1} & v_{2} & \dots & v_{n} \\
\vdots & \vdots & & \vdots
\end{array}\right)\left(\begin{array}{c}
\lambda_{1} \alpha_{1} \\
\vdots \\
\lambda_{n} \alpha_{n}
\end{array}\right) \nonumber \]
\[A \mathbf{x}=V \Lambda \mathbf{\alpha} \nonumber \]
\[A \mathbf{x}=V \Lambda V^{-1} \mathbf{x} \nonumber \]
where \(\Lambda\) is the matrix with the eigenvalues down the diagonal:
\[\Lambda=\left(\begin{array}{cccc}
\lambda_{1} & 0 & \dots & 0 \\
0 & \lambda_{2} & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & \lambda_{n}
\end{array}\right) \nonumber \]
Finally, since this equation holds for every \(\mathbf{x}\), we can drop \(\mathbf{x}\) from both sides and are left with a final equation for \(A\):
\[A=V \Lambda V^{-1} \nonumber \]
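This factorization is easy to verify numerically. A minimal sketch, again with an illustrative matrix of our own choosing:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

lam, V = np.linalg.eig(A)
Lambda = np.diag(lam)  # eigenvalues down the diagonal

# Reconstruct A from its diagonalization: A = V Lambda V^{-1}
assert np.allclose(V @ Lambda @ np.linalg.inv(V), A)
```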
Interpretation
For our interpretation, recall our key formulas:
\[\mathbf{\alpha}=V^{-1} \mathbf{x} \nonumber \]
\[\mathbf{b}=\sum_{i} \alpha_{i} \lambda_{i} v_{i} \nonumber \]
We can interpret operating on \(\mathbf{x}\) with \(A\) as:
\[\left(\begin{array}{c}
x_{1} \\
\vdots \\
x_{n}
\end{array}\right) \rightarrow\left(\begin{array}{c}
\alpha_{1} \\
\vdots \\
\alpha_{n}
\end{array}\right) \rightarrow\left(\begin{array}{c}
\lambda_{1} \alpha_{1} \\
\vdots \\
\lambda_{n} \alpha_{n}
\end{array}\right) \rightarrow\left(\begin{array}{c}
b_{1} \\
\vdots \\
b_{n}
\end{array}\right) \nonumber \]
where the three steps (arrows) in the above illustration represent the following three operations:
- Transform \(\mathbf{x}\) using \(V^{-1}\), which yields \(\alpha\)
- Multiply by \(\Lambda\), which scales each \(\alpha_{i}\) by the eigenvalue \(\lambda_{i}\)
- Transform back using \(V\), which gives us \(\mathbf{b}\)
This is the paradigm we will use for LTI systems!
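The three steps above can be sketched directly in NumPy (with an illustrative matrix and vector); each arrow becomes one line, and the result matches \(A\mathbf{x}\) computed directly.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
x = np.array([1.0, 2.0])
lam, V = np.linalg.eig(A)

alpha = np.linalg.solve(V, x)   # step 1: alpha = V^{-1} x
scaled = lam * alpha            # step 2: multiply by Lambda (elementwise scaling)
b = V @ scaled                  # step 3: transform back with V

# The three-step path agrees with applying A directly
assert np.allclose(b, A @ x)
```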
