In this section the reductions in arithmetic in the DFT that result from the index mapping alone will be examined. In practical algorithms several methods are always combined, but it is helpful in understanding the effects of a particular method to study it alone.
The most general form of an uncoupled two-dimensional DFT is given by
\[X(k_{1},k_{2})=\sum_{n_{2}=0}^{N_{2}-1}\left \{ \sum_{n_{1}=0}^{N_{1}-1} x(n_{1},n_{2})f_{1}(n_{1},n_{2},k_{1})\right \}f_{2}(n_{2},k_{1},k_{2}) \nonumber \]
where the inner sum calculates \(N_2\) length-\(N_1\) DFT's and, if for a type-two map, the effects of the TFs. If the number of arithmetic operations for a length-\(N\) DFT is denoted by \(F(N)\), the number of operations for this inner sum is
F=N2F(N1)F=N2F(N1)" role="presentation" style="position:relative;" tabindex="0">\[F=N_{2}F(N_{1}) \nonumber \]
The outer sum which gives \(N_1\) length-\(N_2\) DFT's requires \(N_1F(N_2)\) operations. The total number of arithmetic operations is then
\[F=N_{2}F(N_{1})+N_{1}F(N_{2}) \nonumber \]
The first question to be considered is for a fixed length \(N\), what is the optimal relation of \(N_1\) and \(N_2\) in the sense of minimizing the required amount of arithmetic. To answer this question, \(N_1\) and \(N_2\) are temporarily assumed to be real variables rather than integers. If the short length- \(N_i\) DFT's in the equation and any TF multiplications are assumed to require \(N_i^2\) operations, i.e
\[F(N_{i})=N_{i}^{2} \nonumber \]
Thus Efficiencies Resulting from Index Mapping with the DFT becomes
\[F=N_{2}N_{1}^{2}+N_{1}N_{2}^{2}=N(N_{1}+N_{2})=N(N_{1}+NN_{1}^{-1}) \nonumber \]
To find the minimum of \(F\) over \(N_1\), the derivative of \(F\) with respect to \(N_1\) is set to zero (temporarily assuming the variables to be continuous) and the result requires \(N_{1}=N_{2}\)
\[\frac{\mathrm{d} F}{\mathrm{d} N_{1}}=0\; \; \; \; \Rightarrow N_{1}=N_{2} \nonumber \]
This result is also easily seen from the symmetry of \(N_1\), and in \(N_2\), in \(N=N_1N_2\). If a more general model of the arithmetic complexity of the short DFT's is used, the same result is obtained, but a closer examination must be made to assure that \(N_{1}=N_{2}\) is a global minimum.
N1=N2N1=N2" role="presentation" style="position:relative;" tabindex="0">If only the effects of the index mapping are to be considered, then the \(F(N)=N^2\) model is used and the equation states that the two factors should be equal. If there are \(M\) factors, a similar reasoning shows that all \(M\) factors should be equal. For the sequence of length
F(N)=N2F(N)=N2" role="presentation" style="position:relative;" tabindex="0">\[N=R^{M} \nonumber \]
there are now \(M\) length-\(R\) DFT's and, since the factors are all equal, the index map must be type two. This means there must be twiddle factors.
In order to simplify the analysis, only the number of multiplications will be considered. If the number of multiplications for a length-\(R\) DFT is \(F(R)\), then the formula for operation counts in the equation generalizes to
\[F=N\sum_{i=1}^{M}\frac{F(N_{i})}{N_{i}}=NM\frac{F(R)}{R} \nonumber \]
for \(N_i=R\)
\[F=N\ln R(N)\frac{F(R)}{R}=(N\ln N)\frac{F(R)}{R\ln R} \nonumber \]
This is a very important formula which was derived by Cooley and Tukey in their famous paper on the FFT. It states that for a given \(R\) which is called the radix, the number of multiplications (and additions) is proportional to \(N\ln N\). It also shows the relation to the value of the radix, \(R\).
In order to get some idea of the “best" radix, the number of multiplications to compute a length-\(R\) DFT is assumed to be
\[F(R)=R^{x} \nonumber \]
If this is used with the equation, the optimal \(R\) can be found.
\[\frac{\mathrm{d} F}{\mathrm{d} R}=0\; \; \; \; \Rightarrow R=e^{\frac{1}{(x-1)}} \nonumber \]
For \(x=2\) this gives \(R=e\), with the closest integer being three.
The result of this analysis states that if no other arithmetic saving methods other than index mapping are used, and if the length-\(R\) DFT's plus TFs requireF=R2F=R2" role="presentation" style="position:relative;" tabindex="0"> \(F=R^{2}\) multiplications, the optimal algorithm requires \(F=3N\log_{3}N\) multiplications for a length \(N=3^{M}\) DFT. Compare this with \(N^2\) for a direct calculation and the improvement is obvious.
While this is an interesting result from the analysis of the effects of index mapping alone, in practice, index mapping is almost always used in conjunction with special algorithms for the short length-\(N_i\) DFT's in the equation. For example, if \(R=2\; \text{or}\; 4\), there are no multiplications required for the short DFT's. Only the TFs require multiplications. Winograd (see Winograd's Short DFT Algorithms ) has derived some algorithms for short DFT's that require O(N) multiplications. This means that F(Ni)=KNiF(Ni)=KNi" role="presentation" style="position:relative;" tabindex="0">
\[F(N_{i})=KN_{i} \nonumber \]
and the operation count \(F\) in "Efficiencies Resulting from Index Mapping with the DFT" is independent of \(N_i\). Therefore, the derivative of \(F\) is zero for all \(N_i\). Obviously, these particular cases must be examined.