4.4: Goertzel's Algorithm or A Better DFT Algorithm

Last updated
Save as PDF

Page ID: 1982

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Goertzel's algorithm is another methods that calculates the DFT by converting it into a digital filtering problem. The method looks at the calculation of the DFT as the evaluation of a polynomial on the unit circle in the complex plane. This evaluation is done by Horner's method which is implemented recursively by an IIR filter.

The First-Order Goertzel Algorithm

The polynomial whose values on the unit circle are the DFT is a slightly modified \(\mathit{z}\)-transform of \(x(n)\) given by

\[X(z)=\sum_{n=0}^{N-1}x(n)z^{-n} \nonumber \]

which for clarity in this development uses a positive exponent . This is illustrated for a length-4 sequence as a third-order polynomial by

\[X(z)=x(3)z^{3}+x(2)z^{2}+x(1)z+x(0) \nonumber \]

The DFT is found by evaluating the equation at \(z=W^{k}\) which can be written as

\[C(k)=X(z)\mid _{z=W^{k}}=DFT\left \{ x(n) \right \}\; \; \; where\; \; \; W=e^{-j2\pi /N} \nonumber \]

The most efficient way of evaluating a general polynomial without any pre-processing is by “Horner's rule" which is a nested evaluation. This is illustrated for the polynomial in the equation below:

\[X(z)=\left \{ [x(3)z+x(2)]z+x(1)z\right \}+x(0) \nonumber \]

This nested sequence of operations can be written as a linear difference equation in the form of

\[y(m)=zy(m-1)+x(N-m) \nonumber \]

with initial condition \(y(0)=0\), and the desired result being the solution at m=N. The value of the polynomial is given by

\[X(z)=y(N) \nonumber \]

This equation can be viewed as a first-order IIR filter with the input being the data sequence in reverse order and the value of the polynomial at \(\mathit{z}\) being the filter output sampled at \(m=N\). Applying this to the DFT gives the Goertzel algorithm:

\[y(m)=W^{k}y(m-1)+x(N-m) \nonumber \]

with

\[y(0)=0\; \; \; and\; \; \; C(k)=y(n) \nonumber \]

where

\[C(k)=\sum_{n=0}^{N-1}x(n)W^{nk} \nonumber \]

When comparing this program with the direct calculation of equation, it is seen that the number of floating-point multiplications and additions are the same. In fact, the structures of the two algorithms look similar, but close examination shows that the way the sines and cosines enter the calculations is different. In the equation, new sine and cosine values are calculated for each frequency and for each data value, while for the Goertzel algorithm in the equation, they are calculated only for each frequency in the outer loop. Because of the recursive or feedback nature of the algorithm, the sine and cosine values are “updated" each loop rather than recalculated. This results in \(2N\) trigonometric evaluations rather than \(2N^2\). It also results in an increase in accumulated quantization error.

It is possible to modify this algorithm to allow entering the data in forward order rather than reverse order. The difference in the equation becomes

\[y(m)=z^{-1}y(m-1)+x(m-1) \nonumber \]

if the equation becomes

\[C(k)=z^{N-1}y(N) \nonumber \]

for \(y(0)=0\). This is the algorithm programmed later.

The Second-Order Goertzel Algorithm

One of the reasons the first-order Goertzel algorithm does not improve efficiency is that the constant in the feedback or recursive path is complex and, therefore, requires four real multiplications and two real additions. A modification of the scheme to make it second-order removes the complex multiplications and reduces the number of required multiplications by two.

Define the variable \(q(m)\) so that

\[y(m)=q(m)-z^{-1}q(m-1) \nonumber \]

This substituted into the right-hand side of the above equation gives

\[y(m)=zq(m-1)-q(m-2)+x(N-m) \nonumber \]

Combining the two equations gives the second order difference equation

\[q(m)=(z+z^{-1})q(m-1)-q(m-2)+x(N-m) \nonumber \]

which together with the output equation, comprise the second-order Goertzel algorithm where

\[X(z)=Y(n) \nonumber \]

for initial conditions

\[q(0)=q(-1)=0 \nonumber \]

A similar development starting with the equation gives a second-order algorithm with forward ordered input as

\[q(m)=(z+z^{-1})q(m-1)-q(m-2)+x(m-1) \nonumber \]

\[y(m)=q(m)-zq(-1) \nonumber \]

with

\[X(z)=z^{N-1}y(N)\; \; for\; \; q(0)=q(-1)=0 \nonumber \]

Note that both the equations are not changed if \(\mathit{z}\) is replaced with \(\mathit{z}^{-1}\), only the output Equation and Equation are different. This means that the polynomial \(X(z)\) may be evaluated at a particular \(\mathit{z}\) and its inverse \(\mathit{z}^{-1}\) from one solution of the two equations using the output equations

\[X(z)=q(N)-z^{-1}q(N-1) \nonumber \]

and

\[X(1/z)=z^{N-1}\left ( q(N)-zq(N-1)\right ) \nonumber \]

Clearly, this allows the DFT of a sequence to be calculated with half the arithmetic since the outputs are calculated two at a time. The second-order DE actually produces a solution \(q(m)\) that contains two first-order components. The output equations are, in effect, zeros that cancel one or the other pole of the second-order solution to give the desired first-order solution. In addition to allowing the calculating of two outputs at a time, the second-order DE requires half the number of real multiplications as the first-order form. This is because the coefficient of the \(q(m-2)\) is unity and the coefficient of the \(q(m-1)\) is real if \(\mathit{z}\) and \(\mathit{z}^{-1}\) are complex conjugates of each other which is true for the DFT.

Analysis of Arithmetic Complexity and Timings

Analysis of the various forms of the Goertzel algorithm from their programs gives the following operation count for real multiplications and real additions assuming real data.

Algorithm	Real Mults.	Real Adds	Trig Eval.
Direct DFT	4N²	4N²	2N²
First-Order	4N²	4N²- 2N	2N
Second-Order	2N² + 2N	4N²	2N
Second-Order 2	N² + N	2N² + N	N

Timings of the algorithms on a PC in milliseconds are given in the following table.

Algorithm	N=125	N=257
Direct DFT	4.90	19.83
First-Order	4.01	16.70
Second-Order	2.64	11.04
Second-Order 2	1.32	5.55

These timings track the floating point operation counts fairly well.

Conclusions

Goertzel's algorithm in its first-order form is not particularly interesting, but the two-at-a-time second-order form is significantly faster than a direct DFT. It can also be used for any polynomial evaluation or for the DTFT at unequally spaced values or for evaluating a few DFT terms. A very interesting observation is that the inner-most loop of the Glassman-Ferguson FFT is a first-order Goertzel algorithm even though that FFT is developed in a very different framework.

In addition to floating-point arithmetic counts, the number of trigonometric function evaluations that must be made or the size of a table to store precomputed values should be considered. Since the value of the \(W^{nk}\) terms in the equation are iteratively calculate in the IIR filter structure, there is round-off error accumulation that should be analyzed in any application.

It may be possible to further improve the efficiency of the second-order Goertzel algorithm for calculating all of the DFT of a number sequence. Perhaps a fourth order DE could calculate four output values at a time and they could be separated by a numerator that would cancel three of the zeros. Perhaps the algorithm could be arranged in stages to give an \(N\log N\) operation count. The current algorithm does not take into account any of the symmetries of the input index.

Contributor

ContribEEBurrus