Course Outline
INTRODUCTION TO THE
FAST FOURIER TRANSFORM
ALGORITHM
Introduction
Fast Fourier Transforms have revolutionized digital signal
processing
What is the FFT?
• A collection of “tricks” that exploit the symmetry of the DFT
calculation to make its execution much faster
• Speedup increases with DFT size
Introduction, continued
Some dates:
• 1805 - algorithm first described by Gauss (published posthumously in 1866)
• 1965 - algorithm rediscovered (not for the first time)
by Cooley and Tukey
In 1967, calculation of an 8192-point DFT on the
top-of-the-line IBM 7094 took ….
• 30 minutes using conventional techniques
• 5 seconds using FFTs
Measures of computational efficiency
Could consider
• Number of additions
• Number of multiplications
• Amount of memory required
• Scalability and regularity
For the present discussion we’ll focus most on
number of multiplications as a measure of
computational complexity
• More costly than additions for fixed-point processors
• Same cost as additions for floating-point processors,
but number of operations is comparable
Computational Cost of Discrete-Time Filtering
Convolution of an N-point input with an M-point unit
sample response ….
Direct convolution:
\[ y[n] = \sum_{k=-\infty}^{\infty} x[k] \, h[n-k] \]
• Number of multiplies ≈ MN
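The MN figure can be checked with a minimal sketch of direct convolution for finite-length sequences. The function name `direct_convolve` and the multiplication counter are illustrative assumptions, not part of the original slides.

```python
def direct_convolve(x, h):
    """Direct linear convolution of an N-point x with an M-point h.

    Returns (y, mults): y has N + M - 1 points and mults counts
    the multiplications actually performed.
    """
    n_len, m_len = len(x), len(h)
    y = [0] * (n_len + m_len - 1)
    mults = 0
    for k in range(n_len):          # every input sample ...
        for m in range(m_len):      # ... meets every filter tap once
            y[k + m] += x[k] * h[m]
            mults += 1
    return y, mults


y, mults = direct_convolve([1, 2, 3, 4], [1, 1, 1])
print(y, mults)   # N = 4, M = 3, so M*N = 12 multiplications
```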
Computational Cost of Discrete-Time Filtering
Convolution of an N-point input with an M-point
unit sample response ….
Using transforms directly:
\[ X[k] = \sum_{n=0}^{N-1} x[n] \, e^{-j 2\pi k n / N} \]
• Computation of an N-point DFT requires $N^2$ multiplies
• Each convolution requires three DFTs of length N+M-1 plus
an additional N+M-1 complex multiplies, or
\[ 3(N + M - 1)^2 + (N + M - 1) \]
• For N >> M, for example, the computation is $O(N^2)$
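As a rough sketch of the transform approach, the code below zero-pads both sequences to length N+M-1, multiplies their directly computed DFTs, and inverts the product. The helper names (`dft`, `idft`, `dft_convolve`, `transform_cost`) are assumptions made for illustration.

```python
import cmath


def dft(x):
    """Direct N-point DFT: N multiplies for each of N outputs."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(X):
    """Direct N-point inverse DFT."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


def dft_convolve(x, h):
    """Linear convolution via three DFTs of length N + M - 1."""
    L = len(x) + len(h) - 1
    xp = list(x) + [0] * (L - len(x))      # zero-pad to avoid wrap-around
    hp = list(h) + [0] * (L - len(h))
    Y = [a * b for a, b in zip(dft(xp), dft(hp))]
    return [v.real for v in idft(Y)]       # real inputs give a real result


def transform_cost(N, M):
    """Nominal multiply count: 3(N+M-1)^2 + (N+M-1)."""
    L = N + M - 1
    return 3 * L * L + L


print(dft_convolve([1, 2, 3, 4], [1, 1, 1]))
print(transform_cost(4, 3))
```

With direct DFTs this costs more than direct convolution; the transform route only pays off once the DFTs are computed with an FFT.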
The Cooley-Tukey decimation-in-time algorithm
• Consider the DFT algorithm for N an integer power of 2,
\[ X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk} = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi n k / N} ; \quad W_N = e^{-j 2\pi / N} \]
• Create separate sums for even and odd values of n:
\[ X[k] = \sum_{n \text{ even}} x[n] W_N^{nk} + \sum_{n \text{ odd}} x[n] W_N^{nk} \]
• Letting n = 2r for n even and n = 2r + 1 for n odd, we
obtain
\[ X[k] = \sum_{r=0}^{(N/2)-1} x[2r] W_N^{2rk} + \sum_{r=0}^{(N/2)-1} x[2r+1] W_N^{(2r+1)k} \]
The Cooley-Tukey decimation-in-time algorithm
Splitting indices in time, we have obtained
\[ X[k] = \sum_{r=0}^{(N/2)-1} x[2r] W_N^{2rk} + \sum_{r=0}^{(N/2)-1} x[2r+1] W_N^{(2r+1)k} \]
But $W_N^2 = e^{-j 2\pi \cdot 2 / N} = e^{-j 2\pi / (N/2)} = W_{N/2}$ and
$W_N^{2rk} W_N^{k} = W_N^{k} W_{N/2}^{rk}$
So …
\[ X[k] = \sum_{r=0}^{(N/2)-1} x[2r] W_{N/2}^{rk} + W_N^{k} \sum_{r=0}^{(N/2)-1} x[2r+1] W_{N/2}^{rk} \]
The first sum is the N/2-point DFT of x[2r]; the second sum is the
N/2-point DFT of x[2r+1]
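The split can be checked numerically with a minimal sketch: compute the two N/2-point DFTs directly, combine them with the factor W_N^k, and compare against the direct N-point DFT. The names `dft` and `dit_one_stage` are illustrative assumptions. The code relies on the fact that an N/2-point DFT is periodic in k with period N/2, so the same half-length results serve all N output indices.

```python
import cmath


def dft(x):
    """Direct DFT, used both as reference and for the half-length DFTs."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def dit_one_stage(x):
    """One decimation-in-time stage for even N."""
    N = len(x)
    half = N // 2
    G = dft(x[0::2])    # N/2-point DFT of the even-indexed samples x[2r]
    H = dft(x[1::2])    # N/2-point DFT of the odd-indexed samples x[2r+1]
    # G and H repeat with period N/2, hence the index k % half.
    return [G[k % half] + cmath.exp(-2j * cmath.pi * k / N) * H[k % half]
            for k in range(N)]


x = [1, 2, 3, 4, 5, 6, 7, 8]
err = max(abs(a - b) for a, b in zip(dit_one_stage(x), dft(x)))
print(err)   # should be at rounding-error level
```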
Savings so far …
We have split the DFT computation into two halves:
\[ X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk} = \sum_{r=0}^{(N/2)-1} x[2r] W_{N/2}^{rk} + W_N^{k} \sum_{r=0}^{(N/2)-1} x[2r+1] W_{N/2}^{rk} \]
Have we gained anything? Consider the nominal
number of multiplications for N = 8
• Original form produces $8^2 = 64$ multiplications
• New form produces $2(4^2) + 8 = 40$ multiplications
• So we’re already ahead ….. Let’s keep going!!
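The same nominal counts can be tabulated for other sizes with a short sketch. The function names are hypothetical, and the counts follow the slide's convention of N² multiplies per direct N-point DFT plus N multiplies by W_N^k for the recombination.

```python
def direct_mults(N):
    """Nominal multiplies for a direct N-point DFT."""
    return N * N


def one_split_mults(N):
    """Two direct N/2-point DFTs plus N multiplies by W_N^k."""
    return 2 * (N // 2) ** 2 + N


for N in (8, 64, 1024):
    print(N, direct_mults(N), one_split_mults(N))
```

For large N a single split approaches a saving of one half, which is why repeating the split pays off.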
Signal flowgraph representation of 8-point DFT
Recall that the DFT is now split into two N/2-point DFTs.
The DFT in (partial) flowgraph notation:
[Figure: partial signal flowgraph of the 8-point DFT]
The complete decomposition into 2-point DFTs
[Figure: signal flowgraph of the complete decomposition into 2-point DFTs]
Now let’s take a closer look at the 2-point DFT
The expression for the 2-point DFT is:
\[ X[k] = \sum_{n=0}^{1} x[n] W_2^{nk} = \sum_{n=0}^{1} x[n] e^{-j 2\pi n k / 2} \]
Evaluating for k = 0,1 we obtain
\[ X[0] = x[0] + x[1] \]
\[ X[1] = x[0] + e^{-j 2\pi \cdot 1 / 2} x[1] = x[0] - x[1] \]
which in signal flowgraph notation looks like ...
This topology is referred to as the
basic butterfly
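In code, the 2-point DFT needs no multiplications at all. The sketch below (function names are illustrative assumptions) compares the sum/difference form with the defining sum.

```python
import cmath


def butterfly2(a, b):
    """2-point DFT as a butterfly: one sum and one difference."""
    return a + b, a - b


def dft2_by_definition(a, b):
    """Same result from the defining sum with W_2 = e^{-j*pi}."""
    x = (a, b)
    return tuple(sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / 2) for n in range(2))
                 for k in range(2))


print(butterfly2(3, 5))
print(dft2_by_definition(3, 5))   # agrees up to rounding error
```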
The complete 8-point decimation-in-time FFT
Number of multiplies for N-point FFTs
• Let $N = 2^{\nu}$ where $\nu = \log_2(N)$
• (log2(N) columns)(N/2 butterflies/column)(2 mults/butterfly)
or ~ $N \log_2(N)$ multiplies
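A recursive sketch of the full radix-2 decimation-in-time FFT makes the count concrete. This is a simple recursive formulation chosen for brevity, not the in-place flowgraph implementation shown on the slides. The name `fft_dit` and the counter are assumptions, and the counter follows the nominal convention of two multiplies per butterfly (N per stage).

```python
import cmath


def fft_dit(x):
    """Recursive radix-2 DIT FFT for len(x) a power of 2.

    Returns (X, mults), where mults is the nominal count of
    multiplications by W_N^k (N per stage, unreduced butterflies).
    """
    N = len(x)
    if N == 1:
        return [complex(x[0])], 0
    if N % 2:
        raise ValueError("length must be a power of 2")
    G, mults_g = fft_dit(x[0::2])   # even-indexed samples
    H, mults_h = fft_dit(x[1::2])   # odd-indexed samples
    half = N // 2
    X = [G[k % half] + cmath.exp(-2j * cmath.pi * k / N) * H[k % half]
         for k in range(N)]
    return X, mults_g + mults_h + N


X, mults = fft_dit([1, 0, 0, 0, 0, 0, 0, 0])
print(mults)                              # 8 * log2(8) = 24
print(fft_dit(list(range(1024)))[1])      # 1024 * 10 = 10240
```

For N = 1024 the nominal count is 10240, against 1024² = 1048576 for the direct DFT, a ratio of about 102.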
Computational complexity
Additional timesavers: reducing multiplications in the basic butterfly
As we derived it, the basic butterfly has branch gains
$W_N^{r}$ and $W_N^{r+N/2}$
Since $W_N^{N/2} = -1$ we can reduce computation by a factor of 2 by
premultiplying by $W_N^{r}$, leaving branch gains $W_N^{r}$ (applied once) and $-1$
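The two butterfly forms can be compared directly. In this sketch (function names are illustrative assumptions) the first version multiplies by both W_N^r and W_N^(r+N/2), while the second multiplies once by W_N^r and reuses the product with a sign change.

```python
import cmath


def butterfly_two_mults(a, b, r, N):
    """Butterfly as first derived: two complex multiplications."""
    w_top = cmath.exp(-2j * cmath.pi * r / N)               # W_N^r
    w_bottom = cmath.exp(-2j * cmath.pi * (r + N / 2) / N)  # W_N^(r+N/2)
    return a + w_top * b, a + w_bottom * b


def butterfly_one_mult(a, b, r, N):
    """Reduced butterfly: premultiply b by W_N^r once, then add and subtract."""
    t = cmath.exp(-2j * cmath.pi * r / N) * b
    return a + t, a - t


print(butterfly_two_mults(1 + 2j, 3 - 1j, 1, 8))
print(butterfly_one_mult(1 + 2j, 3 - 1j, 1, 8))   # same values up to rounding
```

With the reduced butterfly the count drops from about N log2(N) to about (N/2) log2(N) multiplications.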
Bit reversal of the input
Recall the first stages of the 8-point FFT:
Consider the binary representation of the
indices of the input:
0 000
4 100
2 010
6 110
1 001
5 101
3 011
7 111
If these binary indices are reversed (read right to left), we get the
binary sequence representing 0,1,2,3,4,5,6,7
Hence the indices of the FFT inputs are said to be in
bit-reversed order
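The input ordering can be generated with a few lines of code. The helper name `bit_reverse` is an assumption; the sketch reverses the lowest `bits` bits of each index.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)   # shift the result left, append the low bit of i
        i >>= 1
    return r


order = [bit_reverse(i, 3) for i in range(8)]
print(order)   # [0, 4, 2, 6, 1, 5, 3, 7]
for i in order:
    print(i, format(i, "03b"))
```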
Some comments on bit reversal
In the implementation of the FFT that we discussed,
the input is bit reversed and the output is developed
in natural order
Some other implementations of the FFT have the input
in natural order and the output bit reversed
In some situations it is convenient to implement
filtering applications by
• Using FFTs with input in natural order, output in bit-reversed
order
• Multiplying frequency coefficients together (in bit-reversed order)
• Using inverse FFTs with input in bit-reversed order, output in
natural order
Computing in this fashion means we never have to
compute bit reversal explicitly
Decimation in Freq FFT
Alternate FFT structures
We developed the basic decimation-in-time
(DIT) FFT structure, but other forms are
possible simply by rearranging the branches of
the signal flowgraph
Consider the rearranged signal flow diagrams
on the following panels …..
Alternate DIT FFT structures (continued)
DIT structure with input bit-reversed, output
natural (OSB 9.10):
Alternate DIT FFT structures (continued)
DIT structure with input natural, output bit-reversed
(OSB 9.14):
Alternate DIT FFT structures (continued)
DIT structure with both input and output natural
(OSB 9.15):
Alternate DIT FFT structures (continued)
DIT structure with same structure for each stage
(OSB 9.16):
Comments on alternate FFT structures
A method to avoid bit-reversal in filtering
operations is:
• Compute forward transform using natural input, bit-reversed
output (as in OSB 9.14)
• Multiply DFT coefficients of input and filter response
(both in bit-reversed order)
• Compute inverse transform of product using bit-reversed
input and natural output (as in OSB 9.10)
Latter two topologies (as in OSB 9.15 and 9.16)
are now rarely used
Using FFTs for inverse DFTs
We’ve always been talking about forward DFTs in our
discussion about FFTs …. what about the inverse FFT?
\[ x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] W_N^{-kn} ; \quad X[k] = \sum_{n=0}^{N-1} x[n] W_N^{kn} \]
One way to modify the FFT algorithm for the inverse DFT
computation is:
• Replace $W_N^{k}$ by $W_N^{-k}$ wherever it appears
• Multiply final output by 1/N
This method has the disadvantage that it requires
modifying the internal code in the FFT subroutine
A better way to modify FFT code for inverse DFTs
Taking the complex conjugate of both sides of the
IDFT equation and multiplying by N:
\[ N x^*[n] = \sum_{k=0}^{N-1} X^*[k] W_N^{kn} ; \quad \text{or} \quad x[n] = \frac{1}{N} \left[ \sum_{k=0}^{N-1} X^*[k] W_N^{kn} \right]^* \]
This suggests that we can modify the FFT algorithm
for the inverse DFT computation by the following:
• Complex conjugate the input DFT coefficients
• Compute the forward FFT
• Complex conjugate the output of the FFT and multiply by 1/N
This method has the advantage that the internal FFT
code is undisturbed; it is widely used.
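The conjugation method can be sketched as a thin wrapper around any forward FFT. The compact recursive `fft` below stands in for an existing FFT subroutine and is an assumption for illustration, as is the name `ifft_via_conjugate`.

```python
import cmath


def fft(x):
    """Forward radix-2 DIT FFT (len(x) must be a power of 2)."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    G, H = fft(x[0::2]), fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * H[k]
        out[k] = G[k] + t
        out[k + N // 2] = G[k] - t
    return out


def ifft_via_conjugate(X):
    """Inverse DFT using only the forward FFT."""
    N = len(X)
    y = fft([v.conjugate() for v in X])      # conjugate input, forward FFT
    return [v.conjugate() / N for v in y]    # conjugate output, scale by 1/N


x = [1, 2, 3, 4, 5, 6, 7, 8]
x_back = ifft_via_conjugate(fft(x))
print(max(abs(a - b) for a, b in zip(x, x_back)))   # rounding-error level
```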
Principle
Sinusoidal signals
Windowing: resolution and leakage (I)
Windowing: resolution and leakage (II)
Windowing: resolution and leakage (III)
Windowing: resolution and leakage (IV)
Frequencies matched to sampling points
N.B: if Hamming window:
Increasing the resolution
Not by increasing L, the DFT size:
Increasing the resolution(II)
Increasing the resolution(III)
Increasing the resolution(IV)
Increasing the resolution(V)
Increasing the resolution(VI)
Effect of window type
Use of other windows can reduce the contamination between peaks:
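As a rough numerical illustration, the sketch below compares a rectangular and a Hamming window on a complex exponential whose frequency falls halfway between two DFT bins. All parameter choices are assumptions made for this example: N = 64, a frequency of 10.5 bins, an observation at bin 30, and the common symmetric Hamming definition 0.54 − 0.46 cos(2πn/(N−1)).

```python
import cmath
import math


def dft_mag(x):
    """Magnitude of the direct DFT."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)))
            for k in range(N)]


def hamming(N):
    """Symmetric Hamming window (assumed definition)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def leakage_ratio(window, f0_bins, far_bin):
    """Magnitude at a bin far from the tone, relative to the spectral peak."""
    N = len(window)
    x = [window[n] * cmath.exp(2j * cmath.pi * f0_bins * n / N) for n in range(N)]
    mag = dft_mag(x)
    return mag[far_bin] / max(mag)


N = 64
rect_ratio = leakage_ratio([1.0] * N, 10.5, 30)
hamm_ratio = leakage_ratio(hamming(N), 10.5, 30)
print(rect_ratio, hamm_ratio)   # the Hamming value is the smaller one
```

The lower far-bin level comes at the price of a wider main lobe, which is the resolution versus leakage trade-off.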
Effect of window type (II)
Effect of window type (III)
• Kaiser window: prediction formulae for N and β from the
values of
– main lobe width
– relative attenuation of side lobes
Summary
We developed the structure of the basic
decimation-in-time FFT
Use of the FFT algorithm reduces the number
of multiplies required to perform the DFT by
a factor of more than 100 for 1024-point
DFTs, with the advantage increasing with
increasing DFT size