18.4: Solution of (Linear) Least-Squares Problems

Masayuki Yano, James Douglass Penn, George Konidaris, & Anthony T Patera
Massachusetts Institute of Technology via MIT OpenCourseWare

In Chapter 17 we considered the solution of least-squares problems: given \(B \in \mathbb{R}^{m \times n}\) and \(g \in \mathbb{R}^{m}\), find \(z^{*} \in \mathbb{R}^{n}\) which minimizes \(\|B z-g\|^{2}\) over all \(z \in \mathbb{R}^{n}\). We showed that \(z^{*}\) satisfies the normal equations, \(N z^{*}=B^{\mathrm{T}} g\), where \(N \equiv B^{\mathrm{T}} B\). There are (at least) three ways we can implement this least-squares solution in MATLAB.

The first, and worst, is to write zstar = inv(B'*B)*(B'*g). The second, and slightly better, is to take advantage of our backslash operator to write zstar_too = (B'*B)\(B'*g). However, both of these approaches are less numerically stable than we would like: forming the product \(B^{\mathrm{T}} B\) squares the condition number of \(B\), and more generally we should avoid taking powers of matrices since this just exacerbates any intrinsic conditioning or "sensitivity" issues. The third option, and by far the best, is to write zstar_best = B\g. Here the backslash operator "recognizes" that \(B\) is not a square matrix and automatically pursues a least-squares solution based on the stable and efficient \(QR\) decomposition discussed in Chapter 17.
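To make the comparison concrete, the following is a minimal sketch of the three options on a small made-up problem. The test data (a straight-line fit with exact, noise-free data) and the names x, ztrue, and err are assumptions introduced only for illustration; they do not appear in the text. Because the data are consistent, the exact minimizer is known and the error of each approach can be measured directly.

```matlab
% Hypothetical test problem: fit a line to m noise-free points.
% The data and the names x, ztrue, err are illustrative assumptions.
m = 20;
x = linspace(0, 1, m)';
B = [ones(m, 1), x];                  % m-by-2 matrix with full column rank
ztrue = [1; 2];
g = B * ztrue;                        % consistent data: exact minimizer is ztrue

zstar      = inv(B' * B) * (B' * g);  % option 1: explicit inverse (worst)
zstar_too  = (B' * B) \ (B' * g);     % option 2: backslash on the normal equations
zstar_best = B \ g;                   % option 3: backslash on B directly (best)

% Error of each option relative to the known solution
err = [norm(zstar - ztrue), norm(zstar_too - ztrue), norm(zstar_best - ztrue)];
disp(err)

% Forming B'*B squares the 2-norm condition number of B
disp([cond(B)^2, cond(B' * B)])
```

On a well-conditioned problem such as this one all three answers should agree closely; the differences are expected to become visible only as the columns of B approach linear dependence.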

Finally, we shall see in Chapter 19 on statistical regression that some elements of the matrix \(\left(B^{\mathrm{T}} B\right)^{-1}\) will be required to construct confidence intervals. Although it is possible to efficiently calculate certain select elements of this inverse matrix without construction of the full inverse matrix, in fact our systems shall be relatively small and hence inv(B'*B) is quite inexpensive. (Nevertheless, the solution of the least-squares problem is still best implemented as zstar_best = B\g, even if we subsequently form the inverse inv(B'*B) for purposes of confidence intervals.)
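The workflow described above might be sketched as follows: solve with backslash, then form the small inverse separately. The data and the names Ninv, d_full, Rinv, and d_qr are assumptions for illustration. The last part is one possible way (not necessarily the one the authors have in mind) to obtain select elements without the full inverse, assuming the diagonal entries are the elements of interest: with the economy factorization \(B = QR\), we have \(\left(B^{\mathrm{T}} B\right)^{-1} = R^{-1} R^{-\mathrm{T}}\), so each diagonal entry is the sum of squares of a row of \(R^{-1}\).

```matlab
% Hypothetical small regression problem; the data are made up for illustration.
m = 10;  n = 2;
x = linspace(0, 1, m)';
B = [ones(m, 1), x];
g = 1 + 2*x + 0.1*sin(7*x);

zstar_best = B \ g;          % least-squares solution via backslash

Ninv = inv(B' * B);          % full n-by-n inverse; inexpensive because n is small
d_full = diag(Ninv);         % diagonal entries of inv(B'*B)

% Assumed alternative: the same diagonal from the economy-size QR factor R,
% using inv(B'*B) = inv(R) * inv(R)'
[~, R] = qr(B, 0);
Rinv = R \ eye(n);
d_qr = sum(Rinv.^2, 2);      % row-wise sums of squares of inv(R)

disp([d_full, d_qr])         % the two columns should agree to rounding error
```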