# 11.2: Various Approaches to Developing Special Methods

Two methods use a complex FFT in a special way to increase efficiency. The first uses a single length-N complex FFT to compute two length-N real FFTs by putting the two real data sequences into the real and imaginary parts of the input to the complex FFT. Because the transform of real data has an even real part and an odd imaginary part, the transforms of the two inputs can be separated with \(2N-4\) extra additions. This method requires, however, that both inputs be available at the same time.
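
The separation step can be sketched in a few lines of NumPy. This is only an illustration of the idea, not an operation-count-optimal implementation: the function name `two_real_ffts` is made up for this example, and `np.fft.fft` stands in for whatever complex FFT is actually used.

```python
import numpy as np

def two_real_ffts(x, y):
    """Return the DFTs of two real sequences using one complex FFT.

    Hypothetical helper for illustration; x and y must have equal length.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("both inputs must have the same length")

    # Pack x into the real part and y into the imaginary part.
    z = np.fft.fft(x + 1j * y)

    # zc[k] = conj(Z[(N - k) mod N])
    zc = np.conj(np.roll(z[::-1], 1))

    # Z[k] = X[k] + jY[k] and zc[k] = X[k] - jY[k] for real x and y,
    # so the two transforms follow from a sum and a difference.
    X = 0.5 * (z + zc)
    Y = -0.5j * (z - zc)
    return X, Y
```

The sum and difference work because each transform of real data is Hermitian symmetric, which is the even-real, odd-imaginary property stated above.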

The second method uses the fact that the last stage of a decimation-in-time radix-2 FFT combines two independent transforms of length N/2 to compute a length-N transform. If the data are real, the two half-length transforms (of the even-indexed and the odd-indexed samples) are calculated together by the method described above, and the last stage is then carried out to obtain the length-N FFT of the real data. The half-length FFT does not have to be a radix-2 FFT. In fact, it should be calculated by the most efficient complex-data algorithm available, such as the SRFFT or the PFA. Separating the two half-length transforms and computing the last stage requires \(N-6\) real multiplications and \((5/2)N-6\) real additions.
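
A minimal sketch of this second method follows, assuming an even length N. The name `real_fft_via_half_length` is hypothetical, and `np.fft.fft` is used only as a placeholder for the half-length complex FFT. The sketch returns bins 0 through N/2, since the remaining bins follow by conjugate symmetry.

```python
import numpy as np

def real_fft_via_half_length(x):
    """Length-N real DFT (bins 0..N/2) from one length-N/2 complex FFT.

    Hypothetical helper for illustration; N must be even.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n % 2:
        raise ValueError("length must be even")
    h = n // 2

    # Even-indexed samples go in the real part, odd-indexed in the imaginary part.
    z = np.fft.fft(x[0::2] + 1j * x[1::2])

    # Separate the two half-length transforms (same trick as the first method).
    zc = np.conj(np.roll(z[::-1], 1))
    even = 0.5 * (z + zc)
    odd = -0.5j * (z - zc)

    # Last radix-2 decimation-in-time stage.
    k = np.arange(h)
    twiddle = np.exp(-2j * np.pi * k / n)
    X = np.empty(h + 1, dtype=complex)
    X[:h] = even + twiddle * odd
    X[h] = even[0] - odd[0]  # bin N/2, where the twiddle factor equals -1
    return X
```

Because the half-length transform is an ordinary complex FFT, N/2 does not have to be a power of two here; only N itself must be even.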

It is possible to derive more efficient real-data algorithms directly rather than by using a complex FFT. The basic idea, due to Bergland and Sande, is to use the symmetries of a constant-radix Cooley-Tukey FFT at each stage to minimize arithmetic and storage. In the usual derivation of the radix-2 FFT, the length-N transform is written as the combination of the length-N/2 DFT of the even-indexed data and the length-N/2 DFT of the odd-indexed data. If the input to each half-length DFT is real, its output has Hermitian symmetry. Hence the output of each stage can be arranged so that the real part of the complex DFT is stored where half of the DFT would have gone and the imaginary part where the conjugate would have gone. This removes most of the redundant calculations and storage but slightly complicates the addressing. The resulting butterfly structure resembles that of the fast Hartley transform. The complete algorithm requires half the multiplications of the basic complex FFT and \(N-2\) fewer than half its additions. Applying this approach to the split-radix FFT gives a particularly interesting algorithm.
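
The symmetry argument can be illustrated with a short recursive sketch. It is not the in-place algorithm described above: for clarity it keeps ordinary complex arrays rather than packing real and imaginary parts into the conjugate locations, and it assumes the length is a power of two. The name `rfft_radix2` is made up for this example. What it does show is that every stage only ever computes the non-redundant half of each sub-transform.

```python
import numpy as np

def rfft_radix2(x):
    """Radix-2 DIT real FFT returning bins 0..N/2 (N a power of two).

    Illustrative sketch: each recursion level keeps only the
    non-redundant half of the Hermitian-symmetric spectrum.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n == 1:
        return x.astype(complex)
    if n % 2:
        raise ValueError("length must be a power of two")
    h = n // 2

    even = rfft_radix2(x[0::2])  # bins 0..h//2 of the even-indexed data
    odd = rfft_radix2(x[1::2])   # bins 0..h//2 of the odd-indexed data

    k = np.arange(len(even))
    t = np.exp(-2j * np.pi * k / n) * odd

    X = np.empty(h + 1, dtype=complex)
    X[k] = even + t               # bins 0..h//2
    X[h - k] = np.conj(even - t)  # bins h//2..h, obtained from the symmetry
    return X
```

The second assignment uses the Hermitian symmetry of the half-length transforms: bin N/2 - k of the output is the conjugate of the difference branch of the butterfly, so the upper halves of the sub-transforms are never formed.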

Special versions of both the PFA and WFTA can also be developed for real data. Because the operations in the stages of the PFA can be commuted, it is possible to move the combination of the transforms of the real and imaginary parts of the input to the last stage. Because the imaginary part of the input is zero, half of the algorithm is simply omitted. As a result, the real transform requires exactly half the multiplications of the complex-data transform. The number of additions is about N less than half that of the complex case, because adding a pure real number to a pure imaginary number does not require an actual addition. Unfortunately, the indexing and data transfer become somewhat more complicated. A similar approach can be taken with the WFTA.
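
The following sketch does not reproduce the commuted-stage real PFA described above. It only illustrates, under stated assumptions, why roughly half of the work disappears: with the prime-factor index maps, a length-N DFT with N = N1·N2 and coprime factors becomes a true two-dimensional DFT, and for real input only about half of that two-dimensional spectrum has to be computed. The function name `pfa_real_fft` is hypothetical, `np.fft.rfft2` stands in for the short-length modules, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
import numpy as np
from math import gcd

def pfa_real_fft(x, n1, n2):
    """Full DFT of real x of length n1*n2 (n1, n2 coprime) via index maps.

    Illustrative sketch only; computes half of the 2-D spectrum and
    fills in the rest by conjugate symmetry.
    """
    x = np.asarray(x, dtype=float)
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with coprime n1, n2")

    # Input map: n = (n2*i1 + n1*i2) mod N
    i1, i2 = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
    grid = x[(n2 * i1 + n1 * i2) % n]

    # Real 2-D transform: only columns k2 = 0..n2//2 are computed.
    half = np.fft.rfft2(grid)

    # Output map (Chinese remainder theorem):
    # k = (n2*a*k1 + n1*b*k2) mod N, with a = n2^-1 mod n1, b = n1^-1 mod n2
    a = pow(n2, -1, n1)
    b = pow(n1, -1, n2)
    k1, k2 = np.meshgrid(np.arange(n1), np.arange(n2 // 2 + 1), indexing="ij")
    k = (n2 * a * k1 + n1 * b * k2) % n

    X = np.empty(n, dtype=complex)
    X[(n - k) % n] = np.conj(half)  # bins recovered from X[N-k] = conj(X[k])
    X[k] = half                     # bins computed directly
    return X
```

No twiddle factors appear between the two dimensions, which is the property of the PFA that allows its stages to be reordered.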

### Contributor

- ContribEEBurrus