4.2: Matrix Norms

    An \(m \times n\) complex matrix may be viewed as an operator from the (finite-dimensional) normed vector space \(\mathbb{C}^{n}\) to \(\mathbb{C}^{m}\):

    \[A^{m \times n}:\left(\mathbb{C}^{n},\|\cdot\|_{2}\right) \longrightarrow\left(\mathbb{C}^{m},\|\cdot\|_{2}\right) \ \tag{4.5}\]

    where the norm on each space is taken to be the standard Euclidean norm. Define the induced 2-norm of \(A\) as follows:

    \[\|A\|_{2} \triangleq \sup _{x \neq 0} \frac{\|A x\|_{2}}{\|x\|_{2}} \ \tag{4.6}\]

    \[=\max _{\|x\|_{2}=1}\|A x\|_{2} \ \tag{4.7}\]

    The term "induced" refers to the fact that the definition of a norm for vectors such as \(Ax\) and \(x\) is what enables the above definition of a matrix norm. From this definition, it follows that the induced norm measures the amount of "amplification" the matrix \(A\) provides to vectors on the unit sphere in \(\mathbb{C}^{n}\), i.e. it measures the "gain" of the matrix.

    Rather than measuring the vectors \(x\) and \(Ax\) using the 2-norm, we could use any \(p\)-norm, the interesting cases being \(p = 1, 2, \infty\). Our notation for this is

    \[\|A\|_{p}=\max _{\|x\|_{p}=1}\|A x\|_{p} \ \tag{4.8}\]
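As a quick numerical illustration (not part of the original notes), NumPy's `np.linalg.norm(A, ord=p)` with `p = 1, 2, inf` computes exactly these induced matrix norms. Sampling the defining ratio at random vectors can only produce a lower bound on the supremum, which the sketch below checks; the matrix `A` is an arbitrary choice:

```python
import numpy as np

# Arbitrary test matrix (chosen only for illustration).
A = np.array([[1.0, -2.0],
              [3.0,  0.5],
              [0.0,  4.0]])

rng = np.random.default_rng(0)
samples = rng.standard_normal((5000, A.shape[1]))  # random nonzero vectors x

for p in (1, 2, np.inf):
    induced = np.linalg.norm(A, ord=p)  # the induced p-norm of A
    # The sup over a finite sample is only a lower bound on the true sup:
    best = max(np.linalg.norm(A @ x, ord=p) / np.linalg.norm(x, ord=p)
               for x in samples)
    assert best <= induced + 1e-9
    print(f"p = {p}: ||A||_p = {induced:.4f}, sampled lower bound = {best:.4f}")
```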

    An important question to consider is whether or not the induced norm is actually a norm, in the sense defined for vectors in Lecture 1. Recall the three conditions that define a norm:

    1. \(\|x\| \geq 0, \text { and }\|x\|=0 \Longleftrightarrow x=0\);
    2. \(\|\alpha x\|=|\alpha|\|x\|\);
    3. \(\|x+y\| \leq\|x\|+\|y\|\)

    Now let us verify that \(\|A\|_{p}\) is a norm on \(\mathbb{C}^{m \times n}\) using the preceding definition:

    1. \(\|A\|_{p} \geq 0\) since \(\|Ax\|_{p} \geq 0\) for any \(x\). Furthermore, \(\|A\|_{p}=0 \Longleftrightarrow A=0\), since \(\|A\|_{p}\) is calculated from the maximum of \(\|Ax\|_{p}\) evaluated on the unit sphere, and this maximum is zero only if \(Ax = 0\) for every \(x\), i.e. only if \(A = 0\).
    2. \(\|\alpha A\|_{p}=|\alpha|\|A\|_{p}\) follows from \(\|\alpha y\|_{p}=|\alpha|\|y\|_{p}\) (for any \(y\)).
    3. The triangle inequality holds since:

    \[\begin{aligned}
    \|A+B\|_{p} &=\max _{\|x\|_{p}=1}\|(A+B) x\|_{p} \\
    & \leq \max _{\|x\|_{p}=1}\left(\|A x\|_{p}+\|B x\|_{p}\right) \\
    & \leq\|A\|_{p}+\|B\|_{p}
    \end{aligned}\nonumber\]
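These three properties can be spot-checked numerically. The sketch below (an illustration using NumPy, with randomly generated matrices as a convenience assumption) verifies homogeneity and the triangle inequality for the induced 1-, 2-, and \(\infty\)-norms:

```python
import numpy as np

rng = np.random.default_rng(1)

for p in (1, 2, np.inf):
    for _ in range(200):
        A = rng.standard_normal((4, 3))
        B = rng.standard_normal((4, 3))
        alpha = float(rng.standard_normal())

        # Property 2: ||alpha A||_p = |alpha| ||A||_p
        assert np.isclose(np.linalg.norm(alpha * A, ord=p),
                          abs(alpha) * np.linalg.norm(A, ord=p))

        # Property 3: ||A + B||_p <= ||A||_p + ||B||_p
        assert (np.linalg.norm(A + B, ord=p)
                <= np.linalg.norm(A, ord=p) + np.linalg.norm(B, ord=p) + 1e-9)

print("norm axioms verified on random samples")
```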

    Induced norms have two additional properties that are very important:

    1. \(\|A x\|_{p} \leq\|A\|_{p}\|x\|_{p}\), which is a direct consequence of the definition of an induced norm;
    2. For \(A^{m \times n}\) and \(B^{n \times r}\),

    \[\|A B\|_{p} \leq\|A\|_{p}\|B\|_{p} \ \tag{4.9}\]

    which is called the submultiplicative property. This also follows directly from the definition:

    \[\begin{aligned}
    \|A B x\|_{p} & \leq\|A\|_{p}\|B x\|_{p} \\
    & \leq\|A\|_{p}\|B\|_{p}\|x\|_{p} \quad \text { for any } x
    \end{aligned}\nonumber\]

    Dividing by \(\|x\|_{p}\):

    \[\frac{\|A B x\|_{p}}{\|x\|_{p}} \leq\|A\|_{p}\|B\|_{p}\nonumber\]

    from which the result follows.
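Both consequences can be confirmed numerically. The following sketch (using NumPy on randomly generated matrices, an assumption of convenience) checks the bound \(\|Ax\|_{p} \leq \|A\|_{p}\|x\|_{p}\) and the submultiplicative property (4.9):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
x = rng.standard_normal(5)

for p in (1, 2, np.inf):
    nA = np.linalg.norm(A, ord=p)
    nB = np.linalg.norm(B, ord=p)

    # ||Bx||_p <= ||B||_p ||x||_p, a direct consequence of the definition:
    assert np.linalg.norm(B @ x, ord=p) <= nB * np.linalg.norm(x, ord=p) + 1e-9

    # Submultiplicative property (4.9): ||AB||_p <= ||A||_p ||B||_p
    assert np.linalg.norm(A @ B, ord=p) <= nA * nB + 1e-9
```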

    Before we turn to a more detailed study of ideas surrounding the induced 2-norm, which will be the focus of this lecture and the next, we make some remarks about the other induced norms of practical interest, namely the induced 1-norm and induced \(\infty\)-norm. We shall also say something about an important matrix norm that is not an induced norm, namely the Frobenius norm.

    It is a fairly simple exercise to prove that

    \[\|A\|_{1}=\max _{1 \leq j \leq n} \sum_{i=1}^{m}\left|a_{i j}\right| \quad(\text { max of absolute column sums of } A), \ \tag{4.10}\]

    and

    \[\|A\|_{\infty}=\max _{1 \leq i \leq m} \sum_{j=1}^{n}\left|a_{i j}\right| \quad(\text { max of absolute row sums of } A). \ \tag{4.11}\]

    (Note that these definitions reduce to the familiar ones for the 1-norm and \(\infty\)-norm of column vectors in the case \(n = 1\).)
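The formulas (4.10) and (4.11) are easy to check numerically. The sketch below (an illustration with NumPy, on an arbitrarily chosen matrix) computes the maximum absolute column and row sums directly and compares them with NumPy's induced 1- and \(\infty\)-norms:

```python
import numpy as np

A = np.array([[ 1.0, -2.0,  0.5],
              [-3.0,  0.0,  2.0]])

# (4.10): max absolute column sum; (4.11): max absolute row sum.
col_norm = np.abs(A).sum(axis=0).max()
row_norm = np.abs(A).sum(axis=1).max()

assert np.isclose(col_norm, np.linalg.norm(A, ord=1))
assert np.isclose(row_norm, np.linalg.norm(A, ord=np.inf))
print(col_norm, row_norm)   # 4.0 5.0
```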

    The proof for the induced \(\infty\)-norm involves two stages, namely:

    1. Prove that the quantity in Equation (4.11) provides an upper bound \(\gamma\): \[\|A x\|_{\infty} \leq \gamma\|x\|_{\infty} \quad \forall x;\nonumber\]
    2. Show that this bound is achievable for some \(x = \hat{x}\): \[\|A \hat{x}\|_{\infty}=\gamma\|\hat{x}\|_{\infty} \quad \text { for some } \hat{x}\nonumber\]

    In order to show how these steps can be implemented, we give the details for the \(\infty\)-norm case. Let \(x \in \mathbb{C}^{n}\) and consider

    \[\begin{aligned}
    \|A x\|_{\infty} &=\max _{1 \leq i \leq m}\left|\sum_{j=1}^{n} a_{i j} x_{j}\right| \\
    & \leq \max _{1 \leq i \leq m} \sum_{j=1}^{n}\left|a_{i j}\right|\left|x_{j}\right| \\
    & \leq\left(\max _{1 \leq i \leq m} \sum_{j=1}^{n}\left|a_{i j}\right|\right) \max _{1 \leq j \leq n}\left|x_{j}\right| \\
    &=\left(\max _{1 \leq i \leq m} \sum_{j=1}^{n}\left|a_{i j}\right|\right)\|x\|_{\infty}
    \end{aligned}\nonumber\]

    The above inequalities show that an upper bound \(\gamma\) is given by

    \[\max _{\|x\|_{\infty}=1}\|A x\|_{\infty} \leq \gamma=\max _{1 \leq i \leq m} \sum_{j=1}^{n}\left|a_{i j}\right|\nonumber\]

    Now, in order to show that this upper bound is achieved by some vector \( \hat{x}\), let \( \bar{i}\) be an index at which the expression for \( \gamma\) achieves its maximum, that is, \(\gamma=\sum_{j=1}^{n}\left|a_{\bar{i} j}\right|\). Define the vector \( \hat{x}\) as

    \[\hat{x}=\left[\begin{array}{c}
    \operatorname{sgn}\left(a_{\bar{i} 1}\right) \\
    \operatorname{sgn}\left(a_{\bar{i} 2}\right) \\
    \vdots \\
    \operatorname{sgn}\left(a_{\bar{i} n}\right)
    \end{array}\right]\nonumber\]

    Clearly \(\|\hat{x}\|_{\infty}=1\) and

    \[\|A \hat{x}\|_{\infty}=\sum_{j=1}^{n}\left|a_{\bar{i} j}\right|=\gamma\nonumber\]

    The proof for the 1-norm proceeds in exactly the same way, and is left to the reader.
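The two-stage argument above can be replayed numerically. The sketch below (an illustration with NumPy, assuming a real matrix so that \(\operatorname{sgn}\) behaves as in the notes; a zero entry in the maximizing row would make `np.sign` return 0, which is harmless) constructs \(\hat{x}\) from the signs of the maximizing row and verifies that the bound \(\gamma\) is achieved:

```python
import numpy as np

A = np.array([[ 1.0, -2.0,  0.5],
              [-3.0,  4.0,  2.0]])

row_sums = np.abs(A).sum(axis=1)
i_bar = int(np.argmax(row_sums))   # the maximizing row index, i-bar
gamma = row_sums[i_bar]            # the claimed value of ||A||_inf

x_hat = np.sign(A[i_bar])          # entries sgn(a_{i-bar, j})

assert np.linalg.norm(x_hat, ord=np.inf) == 1.0
# The upper bound gamma is achieved at x_hat:
assert np.isclose(np.linalg.norm(A @ x_hat, ord=np.inf), gamma)
assert np.isclose(np.linalg.norm(A, ord=np.inf), gamma)
```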

    There are matrix norms - i.e. functions that satisfy the three defining conditions stated earlier - that are not induced norms. The most important example of this for us is the Frobenius norm:

    \[\|A\|_{F} \triangleq\left(\sum_{j=1}^{n} \sum_{i=1}^{m}\left|a_{i j}\right|^{2}\right)^{\frac{1}{2}} \ \tag{4.12}\]

    \[=\left(\operatorname{trace}\left(A^{\prime} A\right)\right)^{\frac{1}{2}} \quad(\text { verify }) \ \tag{4.13}\]

    In other words, the Frobenius norm is defined as the root sum of squares of the entries, i.e. the usual Euclidean 2-norm of the matrix when it is regarded simply as a vector in \(\mathbb{C}^{mn}\). Although it can be shown that it is not an induced matrix norm, the Frobenius norm still has the submultiplicative property that was noted for induced norms. Yet other matrix norms may be defined (some of them without the submultiplicative property), but the ones above are the only ones of interest to us.
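The identity (4.13) and the submultiplicative property can be verified numerically, and a simple observation shows why the Frobenius norm is not induced: any induced norm satisfies \(\|I\| = 1\), whereas \(\|I_n\|_{F} = \sqrt{n}\). The sketch below uses NumPy on randomly generated complex matrices (an assumption of convenience):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))

# (4.12): root sum of squares of the entries ...
fro = np.sqrt((np.abs(A) ** 2).sum())
# ... which by (4.13) equals sqrt(trace(A' A)), A' the conjugate transpose:
assert np.isclose(fro, np.sqrt(np.trace(A.conj().T @ A).real))
assert np.isclose(fro, np.linalg.norm(A, ord='fro'))

# Still submultiplicative, like the induced norms:
assert (np.linalg.norm(A @ B, 'fro')
        <= np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro') + 1e-9)

# But not induced: every induced norm gives ||I|| = 1, while ||I_3||_F = sqrt(3).
assert np.isclose(np.linalg.norm(np.eye(3), 'fro'), np.sqrt(3))
```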


    This page titled 4.2: Matrix Norms is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Mohammed Dahleh, Munther A. Dahleh, and George Verghese (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.