
4.5: Exercises


    Exercise 4.1

    Verify that for any \(m \times n\) matrix \(A\), the following holds:

    \[\frac{1}{\sqrt{n}}\|A\|_{\infty} \leq\|A\|_{2} \leq \sqrt{m}\|A\|_{\infty}.\nonumber\]
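
    A quick numerical sanity check of these bounds is sketched below (not part of the exercise; it uses NumPy, and the random complex test matrix and its dimensions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3                                   # arbitrary dimensions for the check
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

norm_2 = np.linalg.norm(A, 2)                 # largest singular value of A
norm_inf = np.linalg.norm(A, np.inf)          # maximum absolute row sum

# Check the two-sided bound from Exercise 4.1 (with a small numerical tolerance).
assert norm_inf / np.sqrt(n) <= norm_2 + 1e-12
assert norm_2 <= np.sqrt(m) * norm_inf + 1e-12
print(norm_inf / np.sqrt(n), norm_2, np.sqrt(m) * norm_inf)
```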

    Exercise 4.2

    Suppose \(A^{\prime} = A\). Find the exact relation between the eigenvalues and singular values of \(A\). Does this hold if \(A\) is not conjugate symmetric?
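
    Before attempting a proof, the question can be explored numerically. The sketch below (not part of the exercise; the random Hermitian test matrix is an arbitrary choice) prints the eigenvalues and singular values side by side so the relation can be conjectured:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                      # a random Hermitian matrix, so A' = A

eigvals = np.linalg.eigvalsh(A)               # eigenvalues (real, since A is Hermitian)
sigmas = np.linalg.svd(A, compute_uv=False)   # singular values

print(np.sort(eigvals))
print(np.sort(sigmas))                        # compare the two lists to spot the relation
```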

    Exercise 4.3

    Show that if \(\operatorname{rank}(A) = 1\), then \(\|A\|_{F}=\|A\|_{2}\).
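
    For a concrete illustration, any outer product \(uv^{\prime}\) has rank one; the minimal NumPy sketch below (with arbitrary random vectors) compares the two norms:

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
A = np.outer(u, v.conj())                     # outer product, so rank(A) = 1

print(np.linalg.norm(A, 'fro'), np.linalg.norm(A, 2))   # the two values should agree
```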

    Exercise 4.4

    This problem leads you through the argument for the existence of the SVD, using an iterative construction. Showing that \(A = U \Sigma V^{\prime}\), where \(U\) and \(V\) are unitary matrices, is equivalent to showing that \(U^{\prime}AV = \Sigma\).

    a) Argue from the definition of \(\|A\|_{2}\) that there exist unit vectors (measured in the 2-norm) \(x \in \mathbb{C}^{n}\) and \(y \in \mathbb{C}^{m}\) such that \(Ax = \sigma y\), where \(\sigma= \|A\|_{2}\).

    b) We can extend both \(x\) and \(y\) above to orthonormal bases, i.e. we can find unitary matrices \(V_{1}\) and \(U_{1}\) whose first columns are \(x\) and \(y\) respectively:

    \[V_{1}=\left[x \tilde{V}_{1}\right], \quad U_{1}=\left[y \tilde{U}_{1}\right]\nonumber\]

    Show that one way to do this is via Householder transformations, as follows:

    \[V_{1}=I-2 \frac{h h^{\prime}}{h^{\prime} h}, \quad h=x-[1,0, \ldots, 0]^{\prime}\nonumber\]

    and likewise for \(U_{1}\).

    c) Now define \(A_{1}=U_{1}^{\prime} A V_{1}\). Why is \(\|A_{1}\|_{2}=\|A\|_{2}\)?

    d) Note that

    \[A_{1}=\left(\begin{array}{cc}
    y^{\prime} A x & y^{\prime} A \tilde{V}_{1} \\
    \tilde{U}_{1}^{\prime} A x & \tilde{U}_{1}^{\prime} A \tilde{V}_{1}
    \end{array}\right)=\left(\begin{array}{cc}
    \sigma & w^{\prime} \\
    0 & B
    \end{array}\right)\nonumber\]

    What is the justification for claiming that the lower left element in the above matrix is 0?

    e) Now show that

    \[\left\|A_{1}\left(\begin{array}{c}
    \sigma \\
    w
    \end{array}\right)\right\|_{2} \geq \sigma^{2}+w^{\prime} w\nonumber\]

    and combine this with the fact that \(\|A_{1}\|_{2}=\|A\|_{2}= \sigma\) to deduce that \(w = 0\), so

    \[A_{1}=\left(\begin{array}{cc}
    \sigma & 0 \\
    0 & B
    \end{array}\right)\nonumber\]

    At the next iteration, we apply the above procedure to \(B\), and so on. When the iterations terminate, we have the SVD.

    [The reason that this is only an existence proof and not an algorithm is that it begins by invoking the existence of \(x\) and \(y\), but does not show how to compute them. Very good algorithms do exist for computing the SVD - see Golub and Van Loan's classic, Matrix Computations, Johns Hopkins Press, 1989. The SVD is a cornerstone of numerical computations in a host of applications.]
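
    The sketch below (not part of the exercise) carries out one step of this construction numerically for a real matrix, where the reflector formula of part (b) applies as written; it takes the maximizing vectors \(x\) and \(y\) of part (a) from numpy.linalg.svd purely for illustration, since the proof only needs their existence:

```python
import numpy as np

def unitary_with_first_column(x):
    """Real orthogonal matrix whose first column is the unit vector x, built as the
    Householder reflector I - 2*h*h'/(h'h) with h = x - e1, as in part (b)."""
    e1 = np.zeros_like(x)
    e1[0] = 1.0
    h = x - e1
    if np.linalg.norm(h) < 1e-14:             # x is already e1
        return np.eye(len(x))
    return np.eye(len(x)) - 2.0 * np.outer(h, h) / (h @ h)

rng = np.random.default_rng(3)
m, n = 4, 3
A = rng.standard_normal((m, n))               # real A, so the reflector formula applies directly

# Part (a) only asserts that maximizing vectors x, y exist with A x = sigma * y;
# here they are taken from an SVD routine just to drive the demonstration.
U, S, Vh = np.linalg.svd(A)
sigma, x, y = S[0], Vh[0, :], U[:, 0]

V1 = unitary_with_first_column(x)
U1 = unitary_with_first_column(y)
A1 = U1.T @ A @ V1                            # should show the block structure of part (e)

print(np.round(A1, 10))                       # first row and column are (sigma, 0, ..., 0)
```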

    Exercise 4.5

    Suppose the \(m \times n\) matrix \(A\) is decomposed in the form

    \[A=U\left(\begin{array}{ll}
    \Sigma & 0 \\
    0 & 0
    \end{array}\right) V^{\prime}\nonumber\]

    where \(U\) and \(V\) are unitary matrices, and \(\Sigma\) is an invertible \(r \times r\) matrix (the SVD could be used to produce such a decomposition). Then the "Moore-Penrose inverse", or pseudo-inverse of \(A\), denoted by \(A^{+}\), can be defined as the \(n \times m\) matrix

    \[A^{+}=V\left(\begin{array}{cc}
    \Sigma^{-1} & 0 \\
    0 & 0
    \end{array}\right) U^{\prime}\nonumber\]

    (You can invoke it in Matlab with pinv(A).)

    a) Show that \(A^{+}A\) and \(AA^{+}\) are symmetric, and that \(AA^{+}A=A\) and \(A^{+}AA^{+}=A^{+}\). (These four conditions actually constitute an alternative definition of the pseudo-inverse.)

    b) Show that when \(A\) has full column rank then \(A^{+}=\left(A^{\prime} A\right)^{-1} A^{\prime}\), and that when \(A\) has full row rank then \(A^{+}= A^{\prime} \left(A A^{\prime}\right)^{-1}\).

    c) Show that, of all \(x\) that minimize \(\|y-A x\|_{2}\) (and there will be many, if \(A\) does not have full column rank), the one with smallest length \(\|x\|_{2}\) is given by \(\hat{x} = A^{+} y\).
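
    The sketch below (not part of the exercise) checks parts (a) and (c) numerically, using numpy.linalg.pinv as the counterpart of Matlab's pinv; the rank-deficient random test matrix is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(4)
# A rank-deficient 5 x 4 matrix (rank 2), so the least-squares minimizer is not unique.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))
Aplus = np.linalg.pinv(A)

# Part (a): the four Moore-Penrose conditions.
assert np.allclose(A @ Aplus @ A, A)
assert np.allclose(Aplus @ A @ Aplus, Aplus)
assert np.allclose((A @ Aplus).conj().T, A @ Aplus)
assert np.allclose((Aplus @ A).conj().T, Aplus @ A)

# Part (c): x_hat = A^+ y versus another x with the same residual.
y = rng.standard_normal(5)
x_hat = Aplus @ y
x_other = x_hat + (np.eye(4) - Aplus @ A) @ rng.standard_normal(4)   # same residual
print(np.linalg.norm(y - A @ x_hat), np.linalg.norm(y - A @ x_other))
print(np.linalg.norm(x_hat), np.linalg.norm(x_other))                # x_hat is shorter
```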

    Exercise 4.6

    All the matrices in this problem are real. Suppose

    \[A=Q\left(\begin{array}{l}
    R \\
    0
    \end{array}\right)\nonumber\]

    with \(Q\) being an \(m \times m\) orthogonal matrix and \(R\) an \(n \times n\) invertible matrix. (Recall that such a decomposition exists for any matrix \(A\) that has full column rank.) Also let \(Y\) be an \(m \times p\) matrix of the form

    \[Y=Q\left(\begin{array}{l}
    Y_{1} \\
    Y_{2}
    \end{array}\right)\nonumber\]

    where the partitioning in the expression for \(Y\) is conformable with the partitioning for \(A\).

    (a) What choice \(\hat{X}\) of the \(n \times p\) matrix \(X\) minimizes the Frobenius norm, or equivalently the squared Frobenius norm, of \(Y - AX\)? In other words, find

    \[\hat{X}=\operatorname{argmin}_{X}\|Y-A X\|_{F}^{2}\nonumber\]

    Also determine the resulting minimum value, \(\|Y-A \hat{X}\|_{F}^{2}\). (Your answers should be expressed in terms of the matrices \(Q\), \(R\), \(Y_{1}\) and \(Y_{2}\).)

    (b) Can your \(\hat{X}\) in (a) also be written as \(\left(A^{\prime} A\right)^{-1} A^{\prime}Y\)? Can it be written as \(A^{+} Y\), where \(A^{+}\) denotes the (Moore-Penrose) pseudo-inverse of \(A\)?

    (c) Now obtain an expression for the choice \(\bar{X}\) of \(X\) that minimizes

    \[\|Y-A X\|_{F}^{2} + \|Z-B X\|_{F}^{2}\nonumber\]

    where \(Z\) and \(B\) are given matrices of appropriate dimensions. (Your answer can be expressed in terms of \(A\), \(B\), \(Y\), and \(Z\).)
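
    The setup of this problem can be reproduced numerically; in the sketch below (not part of the exercise), numpy.linalg.qr with mode='complete' is assumed to play the role of the decomposition recalled above, and the random data are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, p = 6, 3, 2
A = rng.standard_normal((m, n))               # full column rank with probability 1
Y = rng.standard_normal((m, p))

# A = Q [R; 0]: Q is m x m orthogonal, R is n x n invertible (upper triangular here).
Q, R_full = np.linalg.qr(A, mode='complete')
R = R_full[:n, :]
assert np.allclose(A, Q @ np.vstack([R, np.zeros((m - n, n))]))

# Conformable partition of Y in the same basis: Y = Q [Y1; Y2].
QtY = Q.T @ Y
Y1, Y2 = QtY[:n, :], QtY[n:, :]
assert np.allclose(Y, Q @ np.vstack([Y1, Y2]))
```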

    Exercise 4.7 Structured Singular Values

    Given a complex square matrix \(A\), define the structured singular value function as follows:

    \[\mu_{\underline{\Delta}}(A)=\frac{1}{\min _{\Delta \in \underline{\Delta}}\left\{\sigma_{\max }(\Delta) \mid \operatorname{det}(I-\Delta A)=0\right\}} \nonumber\]

    where \(\underline{\Delta}\) is some set of matrices.

    a) If \(\underline{\Delta}=\{\alpha I: \alpha \in \mathbb{C}\}\), show that \(\mu_{\underline{\Delta}}(A)=\rho(A)\), where \(\rho(A)\) is the spectral radius of \(A\), defined as \(\rho(A)=\max _{i}\left|\lambda_{i}\right|\), with the \(\lambda_{i}\) denoting the eigenvalues of \(A\).

    b) If \(\underline{\Delta}=\left\{\Delta \in \mathbb{C}^{n \times n}\right\}\), show that \(\mu_{\underline{\Delta}}(A)=\sigma_{\max }(A)\).

    c) If \(\underline{\Delta}=\left\{\operatorname{diag}\left(\alpha_{1}, \cdots, \alpha_{n}\right) \mid \alpha_{i} \in \mathbb{C}\right\}\), show that

    \[\rho(A) \leq \mu_\underline{\Delta}(A)=\mu_\underline{\Delta}\left(D^{-1} A D\right) \leq \sigma_{\max }\left(D^{-1} A D\right)\nonumber\]

    where

    \[D \in\left\{\operatorname{diag}\left(d_{1}, \cdots, d_{n}\right) \mid d_{i}>0\right\} \nonumber\]
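
    Part (c) suggests bounding \(\mu_{\underline{\Delta}}(A)\) from above by minimizing \(\sigma_{\max}(D^{-1}AD)\) over positive diagonal scalings. The sketch below (not part of the exercise; the data are random and the search over \(D\) is a crude random sampling rather than a proper optimization) illustrates the resulting sandwich numerically:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

rho = max(abs(np.linalg.eigvals(A)))           # spectral radius of A
best_upper = np.inf
for _ in range(2000):                          # crude random search over positive diagonal scalings
    d = np.exp(rng.uniform(-2, 2, size=n))
    D = np.diag(d)
    best_upper = min(best_upper, np.linalg.norm(np.linalg.inv(D) @ A @ D, 2))

print(rho, best_upper)                         # rho(A) <= mu(A) <= best_upper, per part (c)
```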

    Exercise 4.8

    Consider again the structured singular value function of a complex square matrix \(A\) defined in the preceding problem. If \(A\) has more structure, it is sometimes possible to compute \(\mu_{\underline{\Delta}}(A)\) exactly. In this problem, assume \(A\) is a rank-one matrix, so that we can write \(A = uv^{\prime}\) where \(u\) and \(v\) are complex vectors of dimension \(n\). Compute \(\mu_{\underline{\Delta}}(A)\) when

    (a) \(\underline{\Delta}=\operatorname{diag}\left(\delta_{1}, \ldots, \delta_{n}\right), \quad \delta_{i} \in \mathbb{C}\).

    (b) \(\underline{\Delta}=\operatorname{diag}\left(\delta_{1}, \ldots, \delta_{n}\right), \quad \delta_{i} \in \mathbb{R}\).

    To simplify the computation, minimize the Frobenius norm of \(\Delta\) in the definition of \(\mu_{\underline{\Delta}}(A)\).


    This page titled 4.5: Exercises is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Mohammed Dahleh, Munther A. Dahleh, and George Verghese (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform.