
5.1: Fitting with Weighted Least Squares

    1. Least squares regression
      1. From discrete data, one can obtain a best estimate of the functional relationship between the variables
      2. Assume \(\hat{y} = f(x)\) with polynomial form:\[\hat{y}(x) = a_{0} + a_{1} x + a_{2} x^{2} + \ldots + a_{m}x^{m} = \sum_{i=0}^{m} a_{i} x^{i} \] where each subscript identifies the coefficient multiplying \(x\) raised to the same power
      3. \(n\) data pairs of \(x\) and \(y\) will determine the \(m+1\) coefficients provided \(m\leq n-1\)
      4. Deviation of the data from the fit function over all data points \[D = \sum_{i=1}^{N} \left[y_{i} - \hat{y}\left(x_{i}\right)\right]^{2} \] \[D = \sum_{i=1}^{N} \left[y_{i} - \left(a_{0} + a_{1} x_{i} + a_{2} x_{i}^{2} + \ldots + a_{m}x_{i}^{m}\right)\right]^{2} \] How to minimize the value \(D\)? Through the derivative, of course: \[dD = \frac{\partial D}{\partial a_{0}}da_{0} + \ldots + \frac{\partial D}{\partial a_{m}}da_{m} \]where the full differential is applied.
      5. In order for \(dD \rightarrow 0\), each partial derivative must vanish \[ \frac{\partial D}{\partial a_{0}} = 0 = \frac{\partial}{\partial a_{0}} \left\{\sum_{i=1}^{N} \left[y_{i} - \left(a_{0} + a_{1} x_{i} + a_{2} x_{i}^{2} + \ldots + a_{m}x_{i}^{m}\right)\right]^{2} \right\} \] \[ \frac{\partial D}{\partial a_{m}} = 0 = \frac{\partial}{\partial a_{m}} \left\{\sum_{i=1}^{N} \left[y_{i} - \left(a_{0} + a_{1} x_{i} + a_{2} x_{i}^{2} + \ldots + a_{m}x_{i}^{m}\right)\right]^{2} \right\} \] which yields \(m+1\) simultaneous equations to solve for \(a_{0}\) through \(a_{m}\)
      6. How to set it up?
        1. Recall the polynomial form, but in terms of each \(i^{\mathrm{th}}\) data point:\[y_{i} = a_{0}x_{i}^{0} + a_{1}x_{i}^{1} + a_{2}x_{i}^{2} + \ldots + a_{m}x_{i}^{m} \] where \(x_{i}\) is the independent variable of the \(i^{\mathrm{th}}\) trial and \(y_{i}\) is the dependent measurand
        2. Need to determine the coefficients, so set up the linear algebra equation:\[\left[ \begin{array}{ccccc} 1 & x_{1} & x_{1}^{2} & \ldots& x_{1}^{m} \\ 1 & x_{2} & x_{2}^{2} & \ldots& x_{2}^{m} \\ 1 & x_{3} & x_{3}^{2} & \ldots& x_{3}^{m} \\ & & \vdots & & \\ 1 & x_{n} & x_{n}^{2} & \ldots& x_{n}^{m} \end{array}\right] \left[\begin{array}{c} a_{0} \\ a_{1} \\ a_{2} \\ \vdots \\ a_{m} \end{array} \right ] = \left[\begin{array}{c} y_{1} \\ y_{2} \\ y_{3} \\ \vdots \\ y_{n} \end{array} \right ] \] which can be written more succinctly as: \[ M\vec{a} = \vec{y} \] where:
          1. \(M\) is the matrix of \(n\) rows (total trials) and \(m+1\) columns (one per coefficient of the \(m^{\mathrm{th}}\)-order polynomial)
          2. \(\vec{a}\) is the vector of \(m+1\) unknown coefficients
          3. \(\vec{y}\) is the vector of \(n\) dependent measurands
          4. Rewriting with dimensional sizes indicated: \[ M_{n\times (m+1)} \ \vec{a}_{(m+1) \times 1} = \vec{y}_{n\times 1} \]
        3. How to isolate \(\vec{a}\)? Use the inverse of course ...
        4. But it is impossible to invert a non-square matrix
      7. Time to use the pseudoinverse
        1. Use the transpose to force a square form on the left-hand side \[ M^{T}\ M\ \vec{a} = M^{T}\ \vec{y} \] where the transpose flips rows and columns across the diagonal:\[ N = \left[\begin{array}{cccc} 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \\ 1 & -1 & -2 & 1 \end{array}\right]\] \[ N^{T} = \left[\begin{array}{ccc} 1 & 1 & 1 \\ 2 & 4 & -1 \\ 3 & 9 & -2 \\ 4 & 16 & 1 \end{array}\right] \]
        2. The resulting left side is now square; the same action is applied to the right side: \[ (M^{T}M)_{(m+1)\times (m+1)} \ \vec{a}_{(m+1)\times 1} = M^{T}_{(m+1) \times n}\ \vec{y}_{n\times 1} \]
        3. This is a more efficient way of composing the numerous sums of input data that the normal equations require
        4. The inverse is applied, recalling that matrix multiplication is NOT commutative: \[ \vec{a} = (M^{T}M)^{-1} M^{T} \ \vec{y}\]
        5. Example \(\PageIndex{1}\)

          Prove that the same results are achieved with Excel's Add Trendline tool or a polyfit function as with the pseudoinverse

          • independent data has the form \(x = \{0.4,1.1,1.9,3, 5,6\}\)
          • dependent data results in \(y = \{2.7,3.6,4.4,5.2,9.2,12.1\}\)
          • How many clicks or code updates does it take to change the order of the polynomial?
          Solution

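          A minimal sketch of the comparison, assuming Python with NumPy (the problem names a polyfit function but not a language, so the library calls below are one reasonable choice):

          ```python
          import numpy as np

          # Data from the problem statement
          x = np.array([0.4, 1.1, 1.9, 3.0, 5.0, 6.0])
          y = np.array([2.7, 3.6, 4.4, 5.2, 9.2, 12.1])

          m = 2  # polynomial order; this one value is all that changes to re-fit

          # M has n rows and m+1 columns of increasing powers of x: 1, x, ..., x^m
          M = np.vander(x, m + 1, increasing=True)

          # Pseudoinverse solution: a = (M^T M)^(-1) M^T y
          a_pinv = np.linalg.inv(M.T @ M) @ M.T @ y

          # Library solution; np.polyfit returns the highest power first, so reverse
          a_polyfit = np.polyfit(x, y, m)[::-1]

          print(a_pinv)
          print(a_polyfit)
          print(np.allclose(a_pinv, a_polyfit))  # True: the two methods agree
          ```

          Changing the polynomial order is a single code update (the value of m); in Excel's Add Trendline dialog it is a comparably small change to the polynomial Order setting.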

    2. Weighted least squares
      1. Knowledge of the pseudoinverse allows selected data to be emphasized
      2. A diagonal matrix of values can increase/decrease the impact of individual data points \[ W^{1/2}M\ \vec{a} = W^{1/2} \ \vec{y} \] where \(W\) has something like the form: \[ W = \left[\begin{array}{cccc} 10 & 0 & 0 & 0 \\ 0 & 10^{3} & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \] where \(x_{3}\) is twice as important as \(x_{4}\), \(x_{1}\) is 5 times more important than \(x_{3}\), and \(x_{2}\) is 2 orders of magnitude more important than \(x_{1}\)
      3. How to choose values to populate \(W\)?
        1. always use positive values (hence the \(^{1/2}\) power)
        2. larger values for those with greater confidence/importance
        3. lower values to reduce influence of spurious points
        4. Reasons for choosing:
          1. Size of uncertainty relative to measurement
          2. Spread of standard deviation statistics
          3. Trustworthiness of collaborative teammate
      4. The solution follows the same pseudoinverse format: left-multiplying \(W^{1/2}M\ \vec{a} = W^{1/2}\ \vec{y}\) by \((W^{1/2}M)^{T}\) and noting \((W^{1/2})^{T}W^{1/2} = W\) gives \[ \vec{a} = (M^{T}W M)^{-1} M^{T}W\ \vec{y}\]
      5. Example \(\PageIndex{2}\)

        How does changing \(W\) affect the polynomial coefficients and the proximity of the fit in the example above? What happens to the curve when one or more data points are emphasized?

        Solution

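        A minimal sketch, continuing the NumPy example above; the weight values here are illustrative assumptions, not prescribed by the problem:

        ```python
        # W must be n x n, one weight per data point (n = 6 for the data above)
        w = np.array([1.0, 1.0, 1.0, 1.0, 100.0, 1.0])  # emphasize the 5th point
        W = np.diag(w)

        # Weighted normal equations: a = (M^T W M)^(-1) M^T W y
        a_wls = np.linalg.inv(M.T @ W @ M) @ M.T @ W @ y

        print(a_pinv)  # unweighted coefficients from the previous sketch
        print(a_wls)   # weighted coefficients
        ```

        As the weight on a point grows, the fitted curve is pulled toward that point at the expense of the fit elsewhere; sweeping one weight over several orders of magnitude makes the effect easy to see in a plot.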

    3. Standard error of the fit \(S_{yx}\)
      1. Quantify the deviation between \(y_{i}\) and \(\hat{y}(x)\) \[ S_{yx} = \sqrt{\frac{\sum_{i=1}^{N}\left(y_{i}-\hat{y}(x_{i})\right)^{2}}{\nu}} \] where \(\nu = N-(m+1)\) is the degrees of freedom (the number of data points minus the number of fit coefficients)
        1. Gives a "goodness of fit" measure like the \(R\) value
        2. Higher \(m\) will often make \(S_{yx}\) decrease
        3. The underlying physical principle should define the \(m^{\mathrm{th}}\) order of the fit; otherwise the extra terms add unnecessary inflection points in \(\hat{y}(x)\)
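
      A minimal sketch of the computation, continuing the NumPy sketches above:

      ```python
      # Standard error of the unweighted m-th order fit
      y_hat = M @ a_pinv                 # fitted values at each x_i
      nu = len(x) - (m + 1)              # degrees of freedom: N - (m + 1)
      S_yx = np.sqrt(np.sum((y - y_hat) ** 2) / nu)
      print(S_yx)
      ```

      Re-running with a larger m will usually shrink S_yx even when the extra terms have no physical meaning, which is the caution in the last point above.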

