
4.3: Singular Value Decomposition


    Before we discuss the singular value decomposition of matrices, we begin with some matrix facts and definitions.

    Some Matrix Facts:

    • A matrix \(U \in C^{n \times n}\) is unitary if \(U^{\prime}U = UU^{\prime}= I\). Here, as in Matlab, the superscript \(^{\prime}\) denotes the (entry-by-entry) complex conjugate of the transpose, which is also called the Hermitian transpose or conjugate transpose.
    • A matrix \(U \in R^{n \times n}\) is orthogonal if \(U^{T}U = UU^{T}= I\), where the superscript \(^{T}\) denotes the transpose.
    • Property: If \(U\) is unitary, then \(\|U x\|_{2}=\|x\|_{2}\).
    • If \(S = S^{\prime}\) (i.e. \(S\) equals its Hermitian transpose, in which case we say \(S\) is Hermitian), then there exists a unitary matrix \(U\) such that \(U^{\prime}SU\) is a diagonal matrix.
    • For any matrix \(A\), both \(A^{\prime}A\) and \(AA^{\prime}\) are Hermitian, and thus can always be diagonalized by unitary matrices.
    • For any matrix \(A\), the eigenvalues of \(A^{\prime}A\) and \(AA^{\prime}\) are always real and non-negative (proved easily, since \(x^{\prime} A^{\prime} A x=\|A x\|_{2}^{2} \geq 0\) for any eigenvector \(x\)).
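    These facts are easy to verify numerically. The sketch below uses NumPy (the random matrices and seed are illustrative assumptions, not from the text) to check unitarity, norm preservation, and the non-negativity of the eigenvalues of \(A^{\prime}A\):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random complex matrix, and a unitary matrix from a QR factorization.
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))

# U'U = I, and multiplication by U preserves the 2-norm.
assert np.allclose(U.conj().T @ U, np.eye(4))
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))

# A'A is Hermitian with real, non-negative eigenvalues.
G = A.conj().T @ A
assert np.allclose(G, G.conj().T)
evals = np.linalg.eigvalsh(G)
assert np.all(evals >= -1e-12)
```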

    Theorem 4.1 (Singular Value Decomposition, or SVD)

    Given any matrix \(A \in C^{m \times n}\), \(A\) can be written as

    \[A=\stackrel{m \times m}{U} \ \stackrel{m \times n}{\Sigma} \ \stackrel{n \times n}{V^{\prime}}, \ \tag{4.14}\]

    where \(U^{\prime}U =I\), \(V^{\prime}V =I\)

    \[\Sigma=\left[\begin{array}{cccc}
    \sigma_{1} & & & \\
    & \ddots & & 0 \\
    & & \sigma_{r} & \\
    \hline & & & \\
    & 0 & & 0
    \end{array}\right]\ \tag{4.15}\]

    and \(\sigma_{i}=\sqrt{i \text { th nonzero eigenvalue of } A^{\prime} A}\). The \(\sigma_{i}\) are termed the singular values of \(A\), and are arranged in order of descending magnitude, i.e.,

    \[\sigma_{1} \geq \sigma_{2} \geq \cdots \geq \sigma_{r}>0\nonumber\]
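    The relation between the singular values and the eigenvalues of \(A^{\prime}A\) can be checked directly; a minimal NumPy sketch (the random matrix and seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))

# Singular values returned by the SVD, already in descending order...
s = np.linalg.svd(A, compute_uv=False)

# ...match the square roots of the eigenvalues of A'A (here A^T A),
# sorted in descending order; the trailing eigenvalues are zero.
evals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(s, np.sqrt(np.clip(evals[:3], 0, None)))
```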

    Proof

    We will prove this theorem for the case \(rank(A) = m\); the general case involves very little more than what is required for this case. The matrix \(AA^{\prime}\) is Hermitian, and it can therefore be diagonalized by a unitary matrix \(U \in C^{m \times m}\), so that

    \[U \Lambda_{1} U^{\prime}=A A^{\prime}.\nonumber \]

    Note that \(\Lambda_{1}=\operatorname{diag}\left(\lambda_{1}, \lambda_{2}, \ldots, \lambda_{m}\right)\) has real positive diagonal entries \(\lambda_{i}\) due to the fact that \(AA^{\prime}\) is positive definite. We can write \(\Lambda_{1}=\Sigma_{1}^{2}=\operatorname{diag}\left(\sigma_{1}^{2}, \sigma_{2}^{2}, \ldots, \sigma_{m}^{2}\right)\). Define \(V_{1}^{\prime}=\Sigma_{1}^{-1} U^{\prime} A \in C^{m \times n}\); it has orthonormal rows, as can be seen from the following calculation: \(V_{1}^{\prime} V_{1}=\Sigma_{1}^{-1} U^{\prime} A A^{\prime} U \Sigma_{1}^{-1}=I\). Choose the matrix \(V_{2}^{\prime}\) in such a way that

    \[V^{\prime}=\left[\begin{array}{c}
    V_{1}^{\prime} \\
    V_{2}^{\prime}
    \end{array}\right]\nonumber\]

    is in \(C^{n \times n}\) and unitary. Define the \(m \times n\) matrix \(\Sigma=\left[\Sigma_{1} \mid 0\right].\) This implies that

    \[\Sigma V^{\prime}=\Sigma_{1} V_{1}^{\prime}=U^{\prime} A\nonumber\]

    In other words, we have \(A = U \Sigma V^{\prime}\).
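    The construction in the proof can be traced numerically for a full-row-rank matrix; the following NumPy sketch (random matrix and seed are illustrative assumptions) diagonalizes \(AA^{\prime}\), forms \(\Sigma_{1}\) and \(V_{1}^{\prime}\), and checks that they reconstruct \(A\):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))  # full row rank (rank = m) almost surely

# Diagonalize AA' with a unitary (here orthogonal) U: U Λ1 U' = AA'.
lam, U = np.linalg.eigh(A @ A.T)
Sigma1 = np.diag(np.sqrt(lam))

# V1' = Σ1^{-1} U' A has orthonormal rows: V1' V1 = I.
V1h = np.linalg.inv(Sigma1) @ U.T @ A
assert np.allclose(V1h @ V1h.T, np.eye(3))

# Since Σ V' = Σ1 V1', the factorization reconstructs A: A = U Σ1 V1'.
assert np.allclose(U @ Sigma1 @ V1h, A)
```

Note that `eigh` returns the eigenvalues in ascending rather than descending order; the factorization holds either way, with the singular values merely permuted.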

    Example 4.2

    For the matrix \(A\) given at the beginning of this lecture, the SVD, computed easily in Matlab by writing \([u, s, v] = svd(A)\), is

    \[A=\left(\begin{array}{cc}
    .7068 & .7075 \\
    .7075 & -.7068
    \end{array}\right)\left(\begin{array}{cc}
    200.1 & 0 \\
    0 & 0.1
    \end{array}\right)\left(\begin{array}{cc}
    .7075 & .7068 \\
    -.7068 & .7075
    \end{array}\right) \ \tag{4.16}\]

    Solution

    Observations:

    i) \[\begin{aligned}
    A A^{\prime} &=U \Sigma V^{\prime} V \Sigma^{T} U^{\prime} \\
    &=U \Sigma \Sigma^{T} U^{\prime}
    \end{aligned}\nonumber\]

    \[=U\left[\begin{array}{ccc|c}
    \sigma_{1}^{2} & & & \\
    & \ddots & & 0 \\
    & & \sigma_{r}^{2} & \\
    \hline & & & \\
    & 0 & & 0
    \end{array}\right] U^{\prime}\ \tag{4.17}\]

    which tells us that \(U\) diagonalizes \(AA^{\prime}\);

    ii) \[\begin{aligned}
    A^{\prime} A &=V \Sigma^{T} U^{\prime} U \Sigma V^{\prime} \\
    &=V \Sigma^{T} \Sigma V^{\prime}
    \end{aligned}\nonumber\]

    \[=V\left[\begin{array}{ccc|c}
    \sigma_{1}^{2} & & & \\
    & \ddots & & 0 \\
    & & \sigma_{r}^{2} & \\
    \hline & & & \\
    & 0 & & 0
    \end{array}\right] V^{\prime}\ \tag{4.18}\]

    which tells us that \(V\) diagonalizes \(A^{\prime}A\);

    iii) If \(U\) and \(V\) are expressed in terms of their columns, i.e.,

    \[\begin{array}{l}
    U=\left[\begin{array}{llll}
    u_{1} & u_{2} & \dots & u_{m}
    \end{array}\right] \end{array}\nonumber\]

    and

    \[\begin{array}{l}
    V=\left[\begin{array}{llll}
    v_{1} & v_{2} & \dots & v_{n}
    \end{array}\right] \end{array}\nonumber\]

    then

    \[A=\sum_{i=1}^{r} \sigma_{i} u_{i} v_{i}^{\prime}\ \tag{4.19}\]

    which is another way to write the SVD. The \(u_{i}\) are termed the left singular vectors of \(A\), and the \(v_{i}\) are its right singular vectors. From this we see that we can alternately interpret \(Ax\) as

    \[A x=\sum_{i=1}^{r} \sigma_{i} u_{i} \underbrace{\left(v_{i}^{\prime} x\right)}_{\text {projection }},\ \tag{4.20}\]

    which is a weighted sum of the \(u_{i}\), where the weights are the products of the singular values and the projections of \(x\) onto the \(v_{i}\).
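    The dyadic expansion (4.19) and the interpretation (4.20) can be verified numerically; a minimal NumPy sketch (random matrix and seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))
U, s, Vh = np.linalg.svd(A)
r = len(s)  # A is full rank here, so r = min(4, 3) = 3

# (4.19): A equals the sum of the rank-one terms σ_i u_i v_i'.
A_sum = sum(s[i] * np.outer(U[:, i], Vh[i, :]) for i in range(r))
assert np.allclose(A_sum, A)

# (4.20): Ax is a weighted sum of the u_i, with weights σ_i (v_i' x).
x = rng.standard_normal(3)
Ax = sum(s[i] * (Vh[i, :] @ x) * U[:, i] for i in range(r))
assert np.allclose(Ax, A @ x)
```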

    Observation (iii) tells us that \(\mathcal{R} a(A)=\operatorname{span}\left\{u_{1}, \ldots, u_{r}\right\}\) (because \(A x=\sum_{i=1}^{r} c_{i} u_{i}\), where the \(c_{i}=\sigma_{i} v_{i}^{\prime} x\) are scalar weights). Since the columns of \(U\) are independent, \(\operatorname{dim} \mathcal{R} a(A)=r=rank(A)\), and \(\left\{u_{1}, \ldots, u_{r}\right\}\) constitute a basis for the range space of \(A\). The null space of \(A\) is spanned by \(\left\{v_{r+1}, \ldots, v_{n}\right\}\). To see this:

    \[\begin{aligned}
    U \Sigma V^{\prime} x=0 & \Longleftrightarrow \Sigma V^{\prime} x=0 \\
    \Longleftrightarrow &\left[\begin{array}{c}
    \sigma_{1} v_{1}^{\prime} x \\
    \vdots \\
    \sigma_{r} v_{r}^{\prime} x
    \end{array}\right]=0 \\
    \Longleftrightarrow & v_{i}^{\prime} x=0, \quad i=1, \ldots, r \\
    \Longleftrightarrow & x \in \operatorname{span}\left\{v_{r+1}, \ldots, v_{n}\right\}
    \end{aligned}\nonumber\]
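    These bases for the range and null space can be checked on a deliberately rank-deficient example; the sketch below uses NumPy (the matrix construction and seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
# A 4x5 matrix of rank 2, built from two rank-one factors.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))
U, s, Vh = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))
assert r == 2

# {u_1, ..., u_r} spans the range: A @ x is unchanged by projection onto it.
x = rng.standard_normal(5)
Ur = U[:, :r]
assert np.allclose(Ur @ (Ur.T @ (A @ x)), A @ x)

# {v_{r+1}, ..., v_n} spans the null space: A annihilates each of them.
for i in range(r, 5):
    assert np.allclose(A @ Vh[i, :], 0)
```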

    Example 4.3

    One application of singular value decomposition is to the solution of a system of algebraic equations. Suppose \(A\) is an \(m \times n\) complex matrix and \(b\) is a vector in \(C^{m}\). Assume that the rank of \(A\) is equal to \(k\), with \(k < m\). We are looking for a solution of the linear system \(Ax = b\). By applying the singular value decomposition procedure to \(A\), we get

    \[\begin{aligned}
    A &=U \Sigma V^{\prime} \\
    &=U\left[\begin{array}{c|c}
    \Sigma_{1} & 0 \\
    0 & 0
    \end{array}\right] V^{\prime}
    \end{aligned}\nonumber\]

    where \(\Sigma_{1}\) is a \(k \times k\) non-singular diagonal matrix. We will express the unitary matrices \(U\) and \(V\) columnwise as

    \[\begin{array}{l}
    U=\left[\begin{array}{llll}
    u_{1} & u_{2} & \dots & u_{m}
    \end{array}\right] \\
    V=\left[\begin{array}{llll}
    v_{1} & v_{2} & \dots & v_{n}
    \end{array}\right]
    \end{array}\nonumber\]

    Solution

    A necessary and sufficient condition for the solvability of this system of equations is that \(u^{\prime}_{i}b=0\) for all \(i\) satisfying \(k < i \leq m\). Otherwise, the system of equations is inconsistent. This condition means that the vector \(b\) must be orthogonal to the last \(m - k\) columns of \(U\). Therefore the system of linear equations can be written as

    \[\left[\begin{array}{c|l}
    \Sigma_{1} & 0 \\
    \hline & \\
    0 & 0
    \end{array}\right] V^{\prime} x=U^{\prime} b\nonumber\]

    \[\left[\begin{array}{c|l}
    \Sigma_{1} & 0 \\
    \hline & \\
    0 & 0
    \end{array}\right] V^{\prime} x=\left[\begin{array}{c}
    u_{1}^{\prime} b \\
    u_{2}^{\prime} b \\
    \vdots \\
    \vdots \\
    u_{m}^{\prime} b
    \end{array}\right]=\left[\begin{array}{c}
    u_{1}^{\prime} b \\
    \vdots \\
    u_{k}^{\prime} b \\
    0 \\
    \vdots \\
    0
    \end{array}\right]\nonumber\]

    Using the above equation and the invertibility of \(\Sigma_{1}\), we can rewrite the system of equations as

    \[\left[\begin{array}{c}
    v_{1}^{\prime} \\
    v_{2}^{\prime} \\
    \vdots \\
    v_{k}^{\prime}
    \end{array}\right] x=\left[\begin{array}{c}
    \frac{1}{\sigma_{1}} u_{1}^{\prime} b \\
    \frac{1}{\sigma_{2}} u_{2}^{\prime} b \\
    \vdots \\
    \frac{1}{\sigma_{k}} u_{k}^{\prime} b
    \end{array}\right]\nonumber\]

    By using the fact that

    \[\left[\begin{array}{c}
    v_{1}^{\prime} \\
    v_{2}^{\prime} \\
    \vdots \\
    v_{k}^{\prime}
    \end{array}\right]\left[\begin{array}{llll}
    v_{1} & v_{2} & \ldots & v_{k}
    \end{array}\right]=I\nonumber\]

    we obtain a solution of the form

    \[x=\sum_{i=1}^{k} \frac{1}{\sigma_{i}}\left(u_{i}^{\prime} b\right) v_{i}\nonumber\]

    From the observations that were made earlier, we know that the vectors \(v_{k+1}, v_{k+2}, \dots, v_{n}\) span the kernel of \(A\), and therefore a general solution of the system of linear equations is given by

    \[x=\sum_{i=1}^{k} \frac{1}{\sigma_{i}}\left(u_{i}^{\prime} b\right) v_{i}+\sum_{i=k+1}^{n} \beta_{i} v_{i}\nonumber\]

    where the coefficients \(\beta_{i}\), with \(i\) in the interval \(k+1 \leq i \leq n\), are arbitrary complex numbers.
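    Both the solvability condition and the general solution can be checked numerically; the NumPy sketch below (the matrix construction and seed are illustrative assumptions) builds a rank-deficient system, verifies \(u_{i}^{\prime} b=0\) for \(i>k\), and confirms that adding null-space components leaves the solution valid:

```python
import numpy as np

rng = np.random.default_rng(6)
# A 4x3 matrix of rank k = 2, built from two rank-one factors.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))
U, s, Vh = np.linalg.svd(A)
k = 2

# A consistent right-hand side b (in the range of A) satisfies the
# solvability condition u_i' b = 0 for k < i <= m.
b = A @ rng.standard_normal(3)
assert np.allclose(U[:, k:].T @ b, 0)

# Particular solution: x = sum_{i<=k} (1/σ_i)(u_i' b) v_i.
x_p = sum((U[:, i] @ b) / s[i] * Vh[i, :] for i in range(k))
assert np.allclose(A @ x_p, b)

# Adding any multiple of v_{k+1}, ..., v_n leaves Ax = b satisfied.
beta = rng.standard_normal(1)  # n - k = 1 free coefficient here
x_gen = x_p + beta[0] * Vh[k, :]
assert np.allclose(A @ x_gen, b)
```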


    This page titled 4.3: Singular Value Decomposition is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Mohammed Dahleh, Munther A. Dahleh, and George Verghese (MIT OpenCourseWare) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.