Engineering LibreTexts

15.6: Cauchy-Schwarz Inequality


    Introduction

    Any treatment of linear algebra as it relates to signal processing would not be complete without a discussion of the Cauchy-Schwarz inequality, a relation that enables a wide array of signal processing applications related to pattern matching through a method called the matched filter. Recall that in standard Euclidean space, the angle \(\theta\) between two nonzero vectors \(x,y\) is given by

    \[\cos (\theta)=\frac{\langle x, y\rangle}{\|x\|\|y\|}. \nonumber \]

    Since \(|\cos (\theta)| \leq 1\), it follows that

    \[|\langle x, y\rangle|^{2} \leq\langle x, x\rangle\langle y, y\rangle. \nonumber \]

    Furthermore, equality holds if and only if \(|\cos(\theta)|=1\), implying that

    \[|\langle x, y\rangle|^{2}=\langle x, x\rangle\langle y, y\rangle \nonumber \]

    if and only if \(y=ax\) for some real \(a\). This relation can be extended to all inner product spaces over a real or complex field and is known as the Cauchy-Schwarz inequality, which is of great importance to the study of signals.

    The Cauchy-Schwarz Inequality

    Statement of the Cauchy-Schwarz Inequality

    The general statement of the Cauchy-Schwarz inequality mirrors the intuition for standard Euclidean space. Let \(V\) be an inner product space over the field of complex numbers \(\mathbb{C}\) with inner product \(\langle\cdot, \cdot\rangle\). For every pair of vectors \(x, y \in V\) the inequality

    \[|\langle x, y\rangle|^{2} \leq\langle x, x\rangle\langle y, y\rangle \nonumber \]

    holds. Furthermore, the equality

    \[|\langle x, y\rangle|^{2}=\langle x, x\rangle\langle y, y\rangle \nonumber \]

    holds if and only if \(y=ax\) for some \(a \in \mathbb{C}\) or, in the degenerate case, \(x=0\). That is, equality holds if and only if \(x\) and \(y\) are linearly dependent.
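
    The statement can be checked numerically in \(\mathbb{C}^n\) with the standard inner product \(\langle x, y\rangle=\sum_k x_k \overline{y_k}\). The following is a minimal sketch, assuming NumPy is available; the helper name `inner`, the vector length, and the random test vectors are illustrative choices and not part of the original module.

```python
import numpy as np

def inner(x, y):
    """Standard inner product on C^n, linear in x and conjugate-linear in y."""
    # np.vdot conjugates its FIRST argument, so the arguments are swapped.
    return np.vdot(y, x)

rng = np.random.default_rng(0)
x = rng.standard_normal(8) + 1j * rng.standard_normal(8)
y = rng.standard_normal(8) + 1j * rng.standard_normal(8)

# Generic pair: the inequality |<x,y>|^2 <= <x,x><y,y> holds.
lhs = abs(inner(x, y)) ** 2
rhs = inner(x, x).real * inner(y, y).real
print(lhs <= rhs)

# Dependent pair y = a x: both sides agree up to rounding error.
a = 2.0 - 3.0j
lhs_dep = abs(inner(x, a * x)) ** 2
rhs_dep = inner(x, x).real * inner(a * x, a * x).real
print(np.isclose(lhs_dep, rhs_dep))
```

    A numerical check of this kind illustrates the inequality for particular vectors; it does not replace the proof.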

    Proof of the Cauchy-Schwarz Inequality

    Let \(V\) be an inner product space over the real or complex field \(F\), and let \(x,y \in V\) be given. In order to prove the Cauchy-Schwarz inequality, it will first be proven that \(|\langle x, y\rangle|^{2}=\langle x, x\rangle\langle y, y\rangle\) if \(y=ax\) for some \(a \in F\). It will then be shown that \(|\langle x, y\rangle|^{2}<\langle x, x\rangle\langle y, y\rangle\) if \(y \neq a x\) for all \(a \in F\).

    Consider the case in which \(y=ax\) for some \(a \in F\). From the properties of inner products, it is clear that

    \[\begin{align}
    |\langle x, y\rangle|^{2} &=|\langle x, a x\rangle|^{2} \nonumber \\
    &=|\bar{a}\langle x, x\rangle|^{2}
    \end{align}. \nonumber \]

    Hence, it follows that

    \[\begin{align}
    |\langle x, y\rangle|^{2} &=|\bar{a}|^{2}|\langle x, x\rangle|^{2} \nonumber \\
    &=|a|^{2}\langle x, x\rangle^{2}
    \end{align}. \nonumber \]

    Similarly, it is clear that

    \[\begin{align}
    \langle x, x\rangle\langle y, y\rangle &=\langle x, x\rangle\langle a x, a x\rangle \nonumber \\
    &=\langle x, x\rangle a \bar{a}\langle x, x\rangle \nonumber \\
    &=|a|^{2}\langle x, x\rangle^{2}
    \end{align} \nonumber \]

    Thus, it is proven that \(|\langle x, y\rangle|^{2}=\langle x, x\rangle\langle y, y\rangle\) if \(y=ax\) for some \(a \in F\).

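    Next, consider the case in which \(y \neq a x\) for all \(a \in F\), which implies that \(y \neq 0\) so \(\langle y, y\rangle \neq 0\). If \(x=0\), both sides of the inequality are zero and there is nothing further to show, so assume \(x \neq 0\). Then \(x-a y \neq 0\) for every \(a \in F\), since \(x=ay\) with \(a \neq 0\) would give \(y=x/a\). Thus, it follows by the positive definiteness of the inner product that, for all \(a \in F\), \(\langle x-a y, x-a y\rangle>0\). This can be expanded using the properties of inner products to the expression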

    \[\begin{align}
    \langle x-a y, x-a y\rangle &=\langle x, x-a y\rangle-a\langle y, x-a y\rangle \nonumber \\
    &=\langle x, x\rangle-\bar{a}\langle x, y\rangle-a\langle y, x\rangle+|a|^{2}\langle y, y\rangle
    \end{align} \nonumber \]

    Choosing \(a=\frac{\langle x, y\rangle}{\langle y, y\rangle}\),

    \[\begin{align}
    \langle x-a y, x-a y\rangle &=\langle x, x\rangle-\frac{\langle y, x\rangle}{\langle y, y\rangle}\langle x, y\rangle-\frac{\langle x, y\rangle}{\langle y, y\rangle}\langle y, x\rangle+\frac{\langle x, y\rangle\langle y, x\rangle}{\langle y, y\rangle^{2}}\langle y, y\rangle \nonumber \\
    &=\langle x, x\rangle-\frac{\langle x, y\rangle\langle y, x\rangle}{\langle y, y\rangle}
    \end{align} \nonumber \]

    Hence, it follows that \(\langle x, x\rangle-\frac{\langle x, y\rangle\langle y, x\rangle}{\langle y, y\rangle}>0\). Multiplying through by \(\langle y, y\rangle>0\) and using \(\langle y, x\rangle=\overline{\langle x, y\rangle}\) gives \(\langle x, x\rangle\langle y, y\rangle-\langle x, y\rangle \overline{\langle x, y\rangle}>0\). Thus, it can be concluded that \(|\langle x, y\rangle|^{2}<\langle x, x\rangle\langle y, y\rangle\) if \(y \neq a x\) for all \(a \in F\).
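
    The choice \(a=\frac{\langle x, y\rangle}{\langle y, y\rangle}\) removes from \(x\) its component along \(y\), so the residual \(x-ay\) is orthogonal to \(y\). A minimal numerical sketch of this step follows, assuming NumPy and the standard inner product on \(\mathbb{C}^n\); the names `inner`, `r`, `residual_energy`, and `closed_form` are illustrative.

```python
import numpy as np

def inner(x, y):
    # Linear in the first argument, conjugate-linear in the second.
    return np.vdot(y, x)

rng = np.random.default_rng(1)
x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
y = rng.standard_normal(6) + 1j * rng.standard_normal(6)

a = inner(x, y) / inner(y, y)   # the choice made in the proof
r = x - a * y                   # residual after removing the component along y

# <x - a y, x - a y> computed directly and from the closed form in the proof.
residual_energy = inner(r, r).real
closed_form = inner(x, x).real - abs(inner(x, y)) ** 2 / inner(y, y).real

print(np.isclose(residual_energy, closed_form))
print(abs(inner(r, y)) < 1e-12)  # r is orthogonal to y up to rounding
```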

    Therefore, the inequality

    \[|\langle x, y\rangle|^{2} \leq\langle x, x\rangle\langle y, y\rangle \nonumber \]

    holds for all \(x,y \in V\), and equality

    \[|\langle x, y\rangle|^{2}=\langle x, x\rangle\langle y, y\rangle \nonumber \]

    holds if and only if \(y=ax\) for some \(a \in F\).

    Additional Mathematical Implications

    Consider the maximization of \(\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right|\) over nonzero \(x\) and \(y\), where the norm \(\|x\|=\sqrt{\langle x, x\rangle}\) is induced by the inner product. By the Cauchy-Schwarz inequality, we know that \(\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right|^{2} \leq 1\) and that \(\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right|^{2}=1\) if and only if \(\frac{y}{\|y\|}=a \frac{x}{\|x\|}\) for some \(a \in \mathbb{C}\). Hence, \(\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right|\) attains its maximum where \(\frac{y}{\|y\|}=a \frac{x}{\|x\|}\) for some \(a \in \mathbb{C}\). Thus, collecting the scalar factors into a single constant, \(\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right|\) attains its maximum where \(y=ax\) for some \(a \in \mathbb{C}\). This result will be particularly useful in developing the matched filter detector techniques.

    Matched Filter Detector

    Background Concepts

    A great many applications in signal processing, image processing, and beyond involve determining the presence and location of a target signal within some other signal. A radar system, for example, searches for copies of a transmitted radar pulse in order to determine the presence of and distance to reflective objects such as buildings or aircraft. A communication system searches for copies of waveforms representing digital 0s and 1s in order to receive a message.

    As has already been shown, the expression \(\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right|\) attains its upper bound, which is 1, when \(y=ax\) for some scalar \(a\) in a real or complex field. The lower bound, which is 0, is attained when \(x\) and \(y\) are orthogonal. Informally, this means that the expression is maximized when the vectors \(x\) and \(y\) have the same shape or pattern and minimized when \(x\) and \(y\) are very different. A pair of vectors with similar but unequal shapes or patterns will produce a relatively large value of the expression that is less than 1, and a pair of vectors with very different but not orthogonal shapes or patterns will produce a relatively small value that is greater than 0. Thus, the above expression carries with it a notion of the degree to which two signals are “alike”. In the case of the standard inner products, it is the magnitude of the normalized correlation between the signals.

    This concept can be extremely useful. For instance, consider a situation in which we wish to determine which signal, if any, from a set \(X\) of signals most resembles a particular signal \(y\). In order to accomplish this, we might evaluate the above expression for every signal \(x \in X\), choosing the one that yields the maximum provided that this maximum is above some threshold of “likeness”. This is the idea behind the matched filter detector, which compares a set of signals against a target signal using the above expression in order to determine which among them are most like the target signal. For a detailed treatment of the applications of the matched filter detector see the linked module.

    Signal Comparison

    The simplest variant of the matched filter detector scheme would be to find the member signal in a set \(X\) of signals that most closely matches a target signal \(y\). Thus, for every \(x \in X\) we wish to evaluate

    \[m(x, y)=\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right| \nonumber \]

    in order to compare every member of \(X\) to the target signal \(y\). Since the member of \(X\) which most closely matches the target signal \(y\) is desired, ultimately we wish to evaluate

    \[x_{m}=\operatorname{argmax}_{x \in X}\left|\left\langle\frac{x}{\|x\|}, \frac{y}{\|y\|}\right\rangle\right|. \nonumber \]

    Note that the target signal does not technically need to be normalized to produce a maximum, but normalizing it gives the desirable property that \(m(x,y)\) is bounded to \([0,1]\).

    The element \(x_m \in X\) that produces the maximum value of \(m(x,y)\) is not necessarily unique, so there may be more than one matching signal in \(X\). Additionally, the signal \(x_m \in X\) producing the maximum value of \(m(x,y)\) may not produce a very large value of \(m(x,y)\) and thus not be very much like the target signal \(y\). Hence, another matched filter scheme might identify the argument producing the maximum but only above a certain threshold, returning no matching signals in \(X\) if the maximum is below the threshold. There also may be a signal \(x \in X\) that produces a large value of \(m(x,y)\) and thus has a high degree of “likeness” to \(y\) but does not produce the maximum value of \(m(x,y)\). Thus, yet another matched filter scheme might identify all signals in \(X\) producing local maxima that are above a certain threshold.
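
    The comparison scheme with a threshold can be sketched for finite-length discrete signals as follows. This is a minimal illustration, assuming NumPy and the standard inner product; the function names `likeness` and `best_match`, the test signals, and the threshold value 0.9 are hypothetical choices made here for demonstration.

```python
import numpy as np

def likeness(x, y):
    """m(x, y) = |<x/||x||, y/||y||>| for nonzero 1-D arrays."""
    return abs(np.vdot(y, x)) / (np.linalg.norm(x) * np.linalg.norm(y))

def best_match(candidates, y, threshold=0.0):
    """Return (index, score) of the best candidate, or (None, score)
    if even the best score falls below the threshold."""
    scores = [likeness(x, y) for x in candidates]
    k = int(np.argmax(scores))
    return (k if scores[k] >= threshold else None), scores[k]

n = np.arange(64)
y = np.sin(2 * np.pi * n / 16)              # target signal
X = [np.cos(2 * np.pi * n / 16),            # orthogonal to the target
     -3.0 * np.sin(2 * np.pi * n / 16),     # scaled copy, y = a x
     np.sin(2 * np.pi * n / 16) + 0.5]      # target plus a constant offset

index, score = best_match(X, y, threshold=0.9)
print(index, score)
```

    Because \(m(x,y)\) ignores scale and sign, the scaled and negated copy of the target is selected, while the orthogonal candidate scores approximately zero.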

    Example \(\PageIndex{1}\)

    For example, consider the target signal given in Figure \(\PageIndex{1}\) and the set of two signals given in Figure \(\PageIndex{2}\). By inspection, it is clear that the signal \(g_2\) is most like the target signal \(f\). However, to make that conclusion mathematically, we use the matched filter detector with the \(L_2\) inner product. If we were to actually make the necessary computations, we would first normalize each signal and then compute the necessary inner products in order to compare the signals in \(X\) with the target signal \(f\). We would notice that the absolute value of the inner product for \(g_2\) with \(f\) when normalized is greater than the absolute value of the inner product of \(g_1\) with \(f\) when normalized, mathematically stated as

    \[g_{2}=\operatorname{argmax}_{x \in\left\{g_{1} , g_{2}\right\}}\left|\left\langle\frac{x}{\|x\|}, \frac{f}{\|f\|}\right\rangle\right| \nonumber \]

    Figure \(\PageIndex{1}\) (Template Signal): We wish to find a match for this target signal in the set of signals below.

    Figure \(\PageIndex{2}\) (Candidate Signals): We wish to find a match for the above target signal in this set of signals.

    Pattern Detection

    A somewhat more involved matched filter detector scheme would involve attempting to match a time-limited target signal \(y=f\) to a set of time-shifted and windowed versions of a single signal \(X=\left\{w S_{t} g \mid t \in \mathbb{R}\right\}\) indexed by \(\mathbb{R}\). The windowing function is given by \(w(t)=u(t-t_1)-u(t-t_2)\) where \([t_1,t_2]\) is the interval to which \(f\) is time limited. This scheme could be used to find portions of \(g\) that have the same shape as \(f\). If the absolute value of the inner product of the normalized versions of \(f\) and \(wS_t g\) is large, which is the absolute value of the normalized correlation for standard inner products, then \(g\) has a high degree of “likeness” to \(f\) on the interval to which \(f\) is time limited but left shifted by \(t\). Of course, if \(f\) is not time limited, it means that the entire signal has a high degree of “likeness” to \(f\) left shifted by \(t\).

    Thus, in order to determine the most likely locations of a signal with the same shape as the target signal \(f\) in a signal \(g\) we wish to compute

    \[t_{m}=\operatorname{argmax}_{t \in \mathbb{R}}\left|\left\langle\frac{f}{\|f\|}, \frac{w S_{t} g}{\left\|w S_{t} g\right\|}\right\rangle\right| \nonumber \]

    to provide the desired shift. Assuming the inner product space examined is \(L_2(\mathbb{R})\) (similar results hold for \(L_2(\mathbb{R}[a,b))\), \(l_2(\mathbb{Z})\), and \(l_2(\mathbb{Z}[a,b])\)), this produces

    \[t_{m}=\operatorname{argmax}_{t \in \mathbb{R}}\left|\frac{1}{\|f\|\left\|w S_{t} g\right\|} \int_{-\infty}^{\infty} f(\tau) w(\tau) \overline{g(\tau-t)} d \tau\right|. \nonumber \]

    Since \(f\) and \(w\) are time limited to the same interval

    \[t_{m}=\operatorname{argmax}_{t \in \mathbb{R}}\left|\frac{1}{\|f\|\left\|w S_{t} g\right\|} \int_{t_{1}}^{t_{2}} f(\tau) \overline{g(\tau-t)} d \tau\right| \nonumber \]

    Making the substitution \(h(t)=\overline{g(-t)}\),

    \[t_{m}=\operatorname{argmax}_{t \in \mathbb{R}}\left|\frac{1}{\|f\|\left\|w S_{t} g\right\|} \int_{t_{1}}^{t_{2}} f(\tau) h(t-\tau) d \tau\right| \nonumber \]

    Noting that this expression contains a convolution operation

    \[t_{m}=\operatorname{argmax}_{t \in \mathbb{R}}\left|\frac{(f * h)(t)}{\|f\|\left\|w S_{t} g\right\|}\right| \nonumber \]

    where \(h\) is the conjugate of the time reversed version of \(g\) defined by \(h(t)=\overline{g(-t)}\).

    In the special case in which the target signal \(f\) is not time limited, \(w\) has unit value on the entire real line. Thus, the norm can be evaluated as \(\left\|w S_{t} g\right\|=\left\|S_{t} g\right\|=\|g\|=\|h\|\). Therefore, the function reduces to \(t_{m}=\operatorname{argmax}_{t \in \mathbb{R}} \left|\frac{(f * h)(t)}{\|f\|\|h\|}\right|\) where \(h(t)=\overline{g(-t)}\). The function \((f \star g)(t)=\frac{(f * h)(t)}{\|f\|\|h\|}\) is known as the normalized cross-correlation of \(f\) and \(g\).
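
    For finite discrete signals, the windowed scheme amounts to comparing the template with every window of the longer signal that has the same length. The sketch below assumes NumPy and the standard \(l_2\) inner product; the function name `sliding_likeness` and the test data are hypothetical. The returned index is the starting sample of the best-matching window in \(g\); how it relates to the shift \(t\) in the text depends on the sign convention chosen for \(S_t\).

```python
import numpy as np

def sliding_likeness(f, g):
    """Normalized correlation magnitude between f and each len(f)-sample window of g."""
    f = np.asarray(f, dtype=complex)
    g = np.asarray(g, dtype=complex)
    m = len(f)
    # Magnitude of the inner product of f with each window of g.
    # np.correlate conjugates its second argument.
    num = np.abs(np.correlate(g, f, mode="valid"))
    # Energy of each window, from a running sum of |g|^2.
    csum = np.concatenate(([0.0], np.cumsum(np.abs(g) ** 2)))
    win_norm = np.sqrt(csum[m:] - csum[:-m])
    out = np.zeros_like(num)
    nz = win_norm > 0                      # skip all-zero windows
    out[nz] = num[nz] / (np.linalg.norm(f) * win_norm[nz])
    return out

rng = np.random.default_rng(2)
f = np.array([1.0, 3.0, -2.0, 4.0, -1.0])   # template
g = 0.1 * rng.standard_normal(50)            # low-level background
shift = 20
g[shift:shift + len(f)] = 2.0 * f            # embed a scaled copy of the template

scores = sliding_likeness(f, g)
t_m = int(np.argmax(scores))
print(t_m, scores[t_m])
```

    Normalizing by the energy of each window, rather than by the energy of all of \(g\), is what keeps every score in \([0,1]\) when the template is time limited.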

    Hence, this matched filter scheme can be implemented as a convolution. Therefore, it may be expedient to implement it in the frequency domain. Similar results hold for the \(L_2(\mathbb{R}[a,b))\), \(l_2(\mathbb{Z})\), and \(l_2(\mathbb{Z}[a,b])\) spaces. It is especially useful to implement the \(l_2(\mathbb{Z}[a,b])\) cases in the frequency domain as the power of the Fast Fourier Transform algorithm can be leveraged to quickly perform the computations in a computer program. In the \(L_2(\mathbb{R}[a,b))\) and \(l_2(\mathbb{Z}[a,b])\) cases, care must be taken to zero pad the signal if wrap-around effects are not desired. Similar results also hold for spaces on higher dimensional intervals with the same inner products.
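
    The frequency domain implementation and the role of zero padding can be sketched as follows, assuming NumPy; the function name `fft_convolve` and the signal lengths are illustrative. Padding both sequences to the length of the full linear convolution avoids wrap-around, while multiplying unpadded transforms yields a circular convolution instead.

```python
import numpy as np

def fft_convolve(f, h):
    """Linear convolution of two finite sequences via zero-padded FFTs."""
    n = len(f) + len(h) - 1        # length of the full linear convolution
    F = np.fft.fft(f, n)           # fft(a, n) zero pads a to length n
    H = np.fft.fft(h, n)
    return np.fft.ifft(F * H)

rng = np.random.default_rng(3)
f = rng.standard_normal(16) + 1j * rng.standard_normal(16)
g = rng.standard_normal(100) + 1j * rng.standard_normal(100)
h = np.conj(g[::-1])               # h(t) = conj(g(-t)), up to an index offset

direct = np.convolve(f, h)         # time-domain reference
fast = fft_convolve(f, h)
print(np.allclose(direct, fast))

# Without enough padding the product of FFTs gives a circular convolution,
# so the leading samples are contaminated by wrap-around.
circular = np.fft.ifft(np.fft.fft(f, len(h)) * np.fft.fft(h))
print(np.allclose(circular, direct[:len(h)]))
```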

    Of course, there is not necessarily exactly one instance of a target signal in a given signal. There could be one instance, more than one instance, or no instance of a target signal. Therefore, it is often more practical to identify all shifts corresponding to local maxima that are above a certain threshold.
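
    One simple way to collect all such shifts from a sampled matched filter output is to keep the interior samples that exceed both neighbors and meet the threshold. The sketch below assumes NumPy; the function name `peaks_above`, the example scores, and the threshold are hypothetical, and practical detectors often add smoothing or a minimum spacing between peaks.

```python
import numpy as np

def peaks_above(scores, threshold):
    """Indices of strict interior local maxima whose value meets the threshold."""
    s = np.asarray(scores, dtype=float)
    mid = s[1:-1]
    is_peak = (mid > s[:-2]) & (mid > s[2:]) & (mid >= threshold)
    return np.flatnonzero(is_peak) + 1   # +1 maps back to indices of s

scores = np.array([0.1, 0.3, 0.95, 0.4, 0.2, 0.6, 0.5, 0.1, 0.92, 0.3])
print(peaks_above(scores, 0.9))
```

    Here the local maximum at index 5 is rejected because its value of 0.6 is below the threshold, leaving the peaks at indices 2 and 8.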

    Example \(\PageIndex{2}\)

    The signal in Figure \(\PageIndex{4}\) contains an instance of the template signal seen in Figure \(\PageIndex{3}\) beginning at time \(t=s_1\) as shown by the plot in Figure \(\PageIndex{5}\). Therefore,

    \[s_{1}=\operatorname{argmax}_{t \in \mathbb{R}}\left|\left\langle\frac{f}{\|f\|}, \frac{w S_{t} g}{\left\|w S_{t} g\right\|}\right\rangle\right| \nonumber \]

    Figure \(\PageIndex{3}\) (Pattern Signal): This function shows the pattern we are looking for in the signal below, which occurs at time \(t=s_1\).
    Figure \(\PageIndex{4}\) (Longer Signal): This signal contains an instance of the above signal starting at time \(t=s_1\).
    Figure \(\PageIndex{5}\) (Absolute Value of Output): This plot shows a sketch of the absolute value of the matched filter output for the interval shown. Note that this is only an "eyeball approximation" sketch. Observe the pronounced peak at time \(t=s_1\).

    Cauchy-Schwarz Inequality Video Lecture

    Proof of the Cauchy-Schwarz Inequality
    Figure \(\PageIndex{6}\): Video lecture on the proof of the Cauchy-Schwarz inequality from Khan Academy. Only part of the theorem is proven.

    Cauchy-Schwarz Inequality Summary

    As can be seen, the Cauchy-Schwarz inequality is a property of inner product spaces over real or complex fields that is of particular importance to the study of signals. Specifically, the implication that the absolute value of an inner product is maximized over normalized vectors when the two arguments are linearly dependent is key to the justification of the matched filter detector. Thus, it enables the use of matched filters for such pattern matching applications as image detection, communications demodulation, and radar signal analysis.


    This page titled 15.6: Cauchy-Schwarz Inequality is shared under a CC BY license and was authored, remixed, and/or curated by Richard Baraniuk et al.
