
4.1: Introduction

    In this lecture, we introduce the notion of a norm for matrices. The singular value decomposition or SVD of a matrix is then presented. The SVD exposes the 2-norm of a matrix, but its value to us goes much further: it enables the solution of a class of matrix perturbation problems that form the basis for the stability robustness concepts introduced later; it solves the so-called total least squares problem, which is a generalization of the least squares estimation problem considered earlier; and it allows us to clarify the notion of conditioning, in the context of matrix inversion. These applications of the SVD are presented at greater length in the next lecture.

    Example 4.1

    To provide some immediate motivation for the study and application of matrix norms, we begin with an example that clearly brings out the issue of matrix conditioning with respect to inversion. The question of interest is how sensitive the inverse of a matrix is to perturbations of the matrix.

    Solution

    Consider inverting the matrix

    \[A=\left(\begin{array}{cc}
    100 & 100 \\
    100.2 & 100
    \end{array}\right) \nonumber\]

    A quick calculation shows that

    \[A^{-1}=\left(\begin{array}{cc}
    -5 & 5 \\
    5.01 & -5
    \end{array}\right) \nonumber\]

    Now suppose we invert the perturbed matrix

    \[A+\Delta A=\left(\begin{array}{cc}
    100 & 100 \\
    100.1 & 100
    \end{array}\right) \nonumber\]

    The result now is

    \[(A+\Delta A)^{-1}=A^{-1}+\Delta\left(A^{-1}\right)=\left(\begin{array}{cc}
    -10 & 10 \\
    10.01 & -10
    \end{array}\right) \nonumber\]

    Here \(\Delta A\) denotes the perturbation in \(A\) and \(\Delta\left(A^{-1}\right)\) denotes the resulting perturbation in \(A^{-1}\). Evidently a 0.1% change in one entry of \(A\) has resulted in a 100% change in the entries of \(A^{-1}\). If we want to solve the problem \(Ax = b\) where \(b=\left[1 \;\; -1\right]^{T}\), then \(x=A^{-1} b=\left[-10 \;\; 10.01\right]^{T}\), while after perturbation of \(A\) we get \(x+\Delta x=(A+\Delta A)^{-1} b=\left[-20 \;\; 20.01\right]^{T}\). Again, we see a 100% change in the entries of the solution with only a 0.1% change in the starting data.
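    These numbers are easy to reproduce. The following is a minimal sketch using NumPy; the matrices and the right-hand side \(b\) are exactly those of the example, while the variable names are ours:

```python
import numpy as np

# The matrix from the example and its perturbed version
A = np.array([[100.0, 100.0],
              [100.2, 100.0]])
A_pert = np.array([[100.0, 100.0],
                   [100.1, 100.0]])
b = np.array([1.0, -1.0])

# A 0.1% change in one entry of A doubles every entry of the inverse
print(np.linalg.inv(A))       # [[-5.    5.  ], [ 5.01 -5.  ]]   (up to roundoff)
print(np.linalg.inv(A_pert))  # [[-10.   10. ], [ 10.01 -10. ]]

# Solutions of Ax = b before and after the perturbation
print(np.linalg.solve(A, b))       # [-10.   10.01]
print(np.linalg.solve(A_pert, b))  # [-20.   20.01]
```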

    The situation seen in the above example is much worse than anything that can arise in the scalar case. If \(a\) is a scalar, then \(d\left(a^{-1}\right)=-a^{-2}\, da\), so \(d\left(a^{-1}\right) /\left(a^{-1}\right)=-d a / a\); the fractional change in the inverse of \(a\) therefore has the same magnitude as the fractional change in \(a\) itself. What is seen in the above example is a purely matrix phenomenon. It would seem to be related to the fact that \(A\) is nearly singular, in the sense that its columns are nearly dependent, its determinant is much smaller than its largest entry, and so on. In what follows (see the next lecture), we shall develop a sound way to measure nearness to singularity, and show how this measure relates to sensitivity under inversion.
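    The informal indicators of near-singularity mentioned above are easy to compute. The sketch below checks two of them for the matrix \(A\) of Example 4.1; note that these particular checks (determinant versus largest entry, and the cosine of the angle between the columns) are our illustration, not the formal measure, which is developed in the next lecture:

```python
import numpy as np

A = np.array([[100.0, 100.0],
              [100.2, 100.0]])

# The determinant (-20) is much smaller than the largest entry (~100)
print(np.linalg.det(A))

# The columns are nearly parallel: the cosine of the angle between
# them is within about 5e-7 of 1, i.e. the columns are nearly dependent
c1, c2 = A[:, 0], A[:, 1]
print(c1 @ c2 / (np.linalg.norm(c1) * np.linalg.norm(c2)))
```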

    Before understanding such sensitivity to perturbations in more detail, we need ways to measure the "magnitudes" of vectors and matrices. We have already introduced the notion of vector norms in Lecture 1, so we now turn to the definition of matrix norms.

