Pipelining and Parallel Processing Architectures
Bishnu Prasad Das
Department of ECE
IIT Roorkee
VLSI Architectures
• Pipelining
– Reduce the critical path
– Increase the clock speed or sample speed
– Reduce power consumption
• Parallel Processing
– Does not reduce the critical path
– Does not increase the clock speed, but increases the sample speed
– Reduce power consumption
A 3-tap FIR Filter
• Direct form structure
• What is the critical path in this structure?
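The question above can be checked with a small sketch. It assumes the usual direct-form equation y(n) = a·x(n) + b·x(n−1) + c·x(n−2); the coefficient names a, b, c and the function names are illustrative, not taken from the slide. In this structure the longest combinational path runs through one multiplier and two adders, so the critical path is TM + 2TA. The numeric values TM=10 u.t. and TA=2 u.t. are borrowed from the fine-grain pipelining slide.

```python
def fir3_direct(x, a, b, c):
    """Direct-form 3-tap FIR: y(n) = a*x(n) + b*x(n-1) + c*x(n-2)."""
    y = []
    x1 = x2 = 0  # delay elements holding x(n-1) and x(n-2), initially zero
    for xn in x:
        y.append(a * xn + b * x1 + c * x2)
        x2, x1 = x1, xn  # shift the delay line
    return y


def direct_form_critical_path(t_m, t_a):
    """Longest combinational path: one multiplier followed by two adders."""
    return t_m + 2 * t_a


y = fir3_direct([1, 2, 3, 4], a=1, b=2, c=3)
print(y)                                 # [1, 4, 10, 16]
print(direct_form_critical_path(10, 2))  # 14
```

The sample period must be at least TM + 2TA, which is what pipelining and parallel processing set out to relax.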
Pipelining and Parallel Concept
• Pipelining
– Introduce pipelining latches along the datapath
• Parallel processing
– Duplicate the hardware
Pipelining FIR Filter
• Critical path
– (TM + 2TA) → (TM + TA)
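A behavioural sketch of one way to get this reduction, under assumptions: latches are placed on the partial sum a·x(n) + b·x(n−1) and on the x(n−2) branch, which is one possible feed-forward cutset and not necessarily the one drawn on the slide. Each stage then contains one multiplier and one adder, and the output equals the direct-form output delayed by one clock cycle. Function and variable names are illustrative.

```python
def fir3_direct(x, a, b, c):
    """Reference: y(n) = a*x(n) + b*x(n-1) + c*x(n-2), zero initial state."""
    padded = [0, 0] + list(x)
    return [a * padded[n + 2] + b * padded[n + 1] + c * padded[n]
            for n in range(len(x))]


def fir3_pipelined(x, a, b, c):
    """Two-stage pipelined 3-tap FIR (assumed cutset).

    Stage 1: s = a*x(n) + b*x(n-1)          -> one multiplier + one adder
    Stage 2: y = s_latch + c * x2_latch     -> one multiplier + one adder
    """
    y = []
    x1 = x2 = 0    # filter delay elements: x(n-1), x(n-2)
    s_latch = 0    # pipeline latch on the partial sum
    x2_latch = 0   # pipeline latch on the x(n-2) branch
    for xn in x:
        # Stage 2 consumes the values latched during the previous cycle.
        y.append(s_latch + c * x2_latch)
        # Stage 1 computes new values and loads the pipeline latches.
        s_latch = a * xn + b * x1
        x2_latch = x2
        x2, x1 = x1, xn
    return y


x = [1, 2, 3, 4, 5]
ref = fir3_direct(x, 1, 2, 3)
pip = fir3_pipelined(x, 1, 2, 3)
print(ref)  # [1, 4, 10, 16, 22]
print(pip)  # [0, 1, 4, 10, 16]
```

The extra cycle of delay in the pipelined output is the latency cost of the shorter critical path.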
Pipelining
• Drawbacks
– Increases the number of delay elements (registers/latches)
along each input-to-output path
– Increases latency
• Clock period limitation: the critical path may be between
– An input and a latch
– A latch and an output
– Two Latches
– An input and an output
• Pipelining latches can only be placed across any
feed-forward cutset of the graph
Data-Broadcast Structure
Fine-grain Pipelining
• Let TM=10 u.t., TA=2 u.t., and the desired clock period=6 u.t.
• Break the multiplier into 2 smaller units with processing
times of 6 u.t. and 4 u.t.
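The arithmetic behind this split can be sketched as follows. It assumes the starting point is the data-broadcast structure, where each stage is one multiplier followed by one adder, and that the new latch sits between the two multiplier pieces. The variable names are illustrative.

```python
T_M, T_A = 10, 2    # multiplier and adder times in u.t. (from the slide)
T_M1, T_M2 = 6, 4   # the two pieces of the split multiplier (from the slide)

# Before the split: multiplier followed by adder in one clock cycle.
coarse_period = T_M + T_A

# After the split: a latch separates m1 from (m2 + adder).
stage1 = T_M1
stage2 = T_M2 + T_A
fine_period = max(stage1, stage2)

print(coarse_period)  # 12
print(fine_period)    # 6
```

Both stages take 6 u.t., so the desired clock period is met with a balanced pipeline.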
Parallel Processing
• Parallel processing and pipelining are duals of each other
• If a computation can be pipelined, it can also be processed in
parallel.
• Convert a single-input single-output (SISO) system to multiple-input
multiple-output (MIMO) system via parallelism
Parallel Processing of 3-Tap FIR Filter
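A behavioural sketch of a 3-parallel version, assuming the filter y(n) = a·x(n) + b·x(n−1) + c·x(n−2) and block inputs x(3k), x(3k+1), x(3k+2) in each clock cycle. Names are illustrative. The critical path of each copy is still TM + 2TA, so the clock period is unchanged, but three samples are produced per clock and the sample period becomes Tclk/3.

```python
def fir3_sequential(x, a, b, c):
    """Reference SISO filter with zero initial state."""
    padded = [0, 0] + list(x)
    return [a * padded[n + 2] + b * padded[n + 1] + c * padded[n]
            for n in range(len(x))]


def fir3_parallel3(x, a, b, c):
    """3-parallel (MIMO) form: three outputs per clock cycle k.

    y(3k)   = a*x(3k)   + b*x(3k-1) + c*x(3k-2)
    y(3k+1) = a*x(3k+1) + b*x(3k)   + c*x(3k-1)
    y(3k+2) = a*x(3k+2) + b*x(3k+1) + c*x(3k)
    """
    if len(x) % 3 != 0:
        raise ValueError("input length must be a multiple of the block size 3")
    y = []
    xm1 = xm2 = 0  # x(3k-1) and x(3k-2), carried over from the previous block
    for k in range(0, len(x), 3):
        x0, x1, x2 = x[k:k + 3]
        y.append(a * x0 + b * xm1 + c * xm2)
        y.append(a * x1 + b * x0 + c * xm1)
        y.append(a * x2 + b * x1 + c * x0)
        xm1, xm2 = x2, x1  # become x(3(k+1)-1) and x(3(k+1)-2)
    return y


samples = list(range(1, 10))
seq_out = fir3_sequential(samples, 1, 2, 3)
par_out = fir3_parallel3(samples, 1, 2, 3)
print(seq_out == par_out)  # True
```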
Underlying Low Power Concept
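The body of this slide was not recoverable. As a hedged reconstruction, these are the two CMOS relations that the low-power analysis in the cited chapter rests on: propagation delay and dynamic power. Here C_charge is the capacitance charged or discharged in one clock cycle along the critical path, C_total is the total switched capacitance, V_0 is the supply voltage (written Vo in the examples), V_t is the threshold voltage, k is a technology-dependent constant, and f is the clock frequency.

```latex
T_{pd} = \frac{C_{charge}\, V_0}{k\,(V_0 - V_t)^2}
\qquad\qquad
P = C_{total}\, V_0^{2}\, f
```

Lowering V_0 cuts power quadratically but lengthens the delay, so pipelining and parallel processing are used to recover the lost speed.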
Pipelining for Low Power
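A sketch of the standard argument, under two assumptions: an M-level pipeline cuts the capacitance charged per clock cycle on the critical path to C_charge/M, and the latch capacitance is neglected so that C_total is unchanged. If the clock frequency f is kept the same, the supply can be scaled from V_0 to βV_0 (0 < β < 1) until the shortened path again fills the whole clock period. Symbols: V_t is the threshold voltage and k is a technology constant.

```latex
\begin{align*}
T_{seq} &= \frac{C_{charge}\, V_0}{k\,(V_0 - V_t)^2}, \qquad
T_{pip} = \frac{(C_{charge}/M)\, \beta V_0}{k\,(\beta V_0 - V_t)^2} \\
T_{pip} = T_{seq} \;&\Rightarrow\; M\,(\beta V_0 - V_t)^2 = \beta\,(V_0 - V_t)^2 \\
P_{pip} &= C_{total}\,(\beta V_0)^2 f = \beta^{2}\, P_{seq}
\end{align*}
```

The quadratic in β is solved for the root with βV_0 > V_t; the power saving is then the factor β².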
Pipelining Example
• Consider an original 3-tap FIR filter and its fine-grain
pipelined version shown in the following figures. Assume
TM=10 u.t., TA=2 u.t., Vt=0.6 V, Vo=5 V, and CM=5CA. In the
fine-grain pipelined filter, the multiplier is broken into 2 parts,
m1 and m2, with computation times of 6 u.t. and 4 u.t.,
respectively, and capacitances 3 times and 2 times that of
an adder, respectively.
(a) What is the supply voltage of the pipelined filter if the
clock period remains unchanged?
(b) What is the power consumption of the pipelined filter as
a percentage of the original filter?
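A hedged numerical sketch of this example. It assumes that the original critical path charges one multiplier and one adder (CM + CA = 6CA), that the pipelined critical path charges 3CA (either m1 alone or m2 plus an adder), that delay is proportional to C·V/(V − Vt)², and that latch capacitance is neglected so total capacitance and clock frequency are unchanged. The function name `solve_beta` is illustrative.

```python
import math


def solve_beta(cap_ratio, v0, vt):
    """Solve (beta*v0 - vt)^2 = cap_ratio * beta * (v0 - vt)^2 for beta.

    cap_ratio = (C_charge_new / C_charge_old) * (T_old / T_new).
    Returns the root for which the scaled supply beta*v0 exceeds vt.
    """
    qa = v0 ** 2
    qb = -(2 * v0 * vt + cap_ratio * (v0 - vt) ** 2)
    qc = vt ** 2
    disc = math.sqrt(qb * qb - 4 * qa * qc)
    roots = [(-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)]
    return max(r for r in roots if r * v0 > vt)


V0, VT = 5.0, 0.6
C_ORIG = 5 + 1   # assumed: multiplier (5 CA) + adder (1 CA) on the critical path
C_PIPE = 3       # assumed: m1 (3 CA), or m2 + adder (2 CA + 1 CA)

beta = solve_beta(C_PIPE / C_ORIG, V0, VT)  # same clock period, so T_old/T_new = 1
supply = beta * V0
power_ratio = beta ** 2  # same total capacitance and frequency

print(f"beta = {beta:.4f}")                 # beta = 0.6033
print(f"supply = {supply:.2f} V")           # supply = 3.02 V
print(f"power = {100 * power_ratio:.1f} %")  # power = 36.4 %
```

Under these assumptions, (a) the supply voltage is about 3.02 V and (b) the pipelined filter consumes about 36.4% of the original power.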
Parallel Processing for Low Power
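A sketch of the standard argument, assuming an L-parallel system built from L copies of the hardware: the charging capacitance on the critical path stays at C_charge, the total capacitance becomes L·C_total, and the clock period can be stretched to L·T_seq while keeping the same sample rate. The supply can then be scaled from V_0 to βV_0. Symbols: V_t is the threshold voltage, k is a technology constant, f is the original clock frequency.

```latex
\begin{align*}
T_{seq} &= \frac{C_{charge}\, V_0}{k\,(V_0 - V_t)^2}, \qquad
L\,T_{seq} = \frac{C_{charge}\, \beta V_0}{k\,(\beta V_0 - V_t)^2} \\
&\Rightarrow\; L\,(\beta V_0 - V_t)^2 = \beta\,(V_0 - V_t)^2 \\
P_{par} &= (L\,C_{total})\,(\beta V_0)^2\,\frac{f}{L} = \beta^{2}\, P_{seq}
\end{align*}
```

As with pipelining, the power falls by β², here at the cost of L times the hardware.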
Parallel Processing Example
• Consider a 4-tap FIR filter shown in Fig. 3.18(a) and its 2-parallel
version in Fig. 3.18(b). The two architectures are operated at a sample
period of 9 u.t. Assume TM=8 u.t., TA=1 u.t., Vt=0.45 V, Vo=3.3 V, CM=8CA.
(a) What is the supply voltage of the 2-parallel filter?
(b) What is the power consumption of the 2-parallel filter as a
percentage of the original filter?
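A hedged numerical sketch of this example. The figures are not reproduced here, so the critical paths are assumptions: the original filter is taken to charge one multiplier and one adder (CM + CA = 9CA, consistent with the 9 u.t. sample period), and the 2-parallel filter one multiplier and two adders (CM + 2CA = 10CA). The 2-parallel clock period is 2 × 9 = 18 u.t. Delay is taken as proportional to C·V/(V − Vt)². The function name `solve_beta` is illustrative.

```python
import math


def solve_beta(cap_ratio, v0, vt):
    """Solve (beta*v0 - vt)^2 = cap_ratio * beta * (v0 - vt)^2 for beta.

    cap_ratio = (C_charge_new / C_charge_old) * (T_old / T_new).
    Returns the root for which the scaled supply beta*v0 exceeds vt.
    """
    qa = v0 ** 2
    qb = -(2 * v0 * vt + cap_ratio * (v0 - vt) ** 2)
    qc = vt ** 2
    disc = math.sqrt(qb * qb - 4 * qa * qc)
    roots = [(-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)]
    return max(r for r in roots if r * v0 > vt)


V0, VT = 3.3, 0.45
C_ORIG = 8 + 1       # assumed: CM + CA, in units of CA
C_PAR = 8 + 2 * 1    # assumed: CM + 2*CA, in units of CA
T_ORIG, T_PAR = 9, 18  # clock periods in u.t. (2-parallel runs at half the clock rate)

beta = solve_beta((C_PAR / C_ORIG) * (T_ORIG / T_PAR), V0, VT)
supply = beta * V0
# Total capacitance doubles and clock frequency halves, so the ratio is beta^2.
power_ratio = beta ** 2

print(f"beta = {beta:.4f}")                 # beta = 0.6589
print(f"supply = {supply:.2f} V")           # supply = 2.17 V
print(f"power = {100 * power_ratio:.1f} %")  # power = 43.4 %
```

Under these assumptions, (a) the supply voltage is about 2.17 V and (b) the 2-parallel filter consumes about 43.4% of the original power.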
Conclusion
• Methodologies for using pipelining and parallel
processing in low-power applications
Reference
1. K. K. Parhi, “VLSI Digital Signal Processing Systems: Design and
Implementation”, John Wiley & Sons, 1999. (Chapter 3)