
Lec 14

The document covers various types of adders used in digital design, including ripple adders, carry-select adders, and carry-lookahead adders, detailing their designs, delays, and costs. It explains how different architectures optimize for speed and efficiency, particularly in the context of FPGA implementations. The final section summarizes the performance characteristics of each adder type, emphasizing the trade-offs between cost and delay.

Uploaded by

Grace Zhang

inst.eecs.berkeley.edu/~eecs151

EECS151 : Introduction to Digital Design and ICs


Lecture 13 – Adders
Chris Fletcher
Bora Nikolić

Tuesdays and Thursdays 9:30-11am


Mulford 159 and webcast

EECS151 L13 ADDERS 1


Adder review, subtraction, carry-select

EE141
4
4-bit Adder Example
❑ Motivate the adder circuit design by hand addition:

❑ Add a0 and b0 as follows:
r = a XOR b = a ⊕ b
c = a AND b = ab (carry to next stage)

• Add a1 and b1 as follows:
r = a ⊕ b ⊕ ci
co = ab + aci + bci
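
The half-adder and full-adder equations above can be sketched as a small bit-level model. This is only an illustration: the function names and the LSB-first bit-list representation are assumptions made here, not part of the lecture.

```python
def half_adder(a, b):
    """One-bit add without carry-in: r = a XOR b, c = a AND b."""
    return a ^ b, a & b


def full_adder(a, b, ci):
    """One-bit add with carry-in: r = a XOR b XOR ci, co = ab + a*ci + b*ci."""
    r = a ^ b ^ ci
    co = (a & b) | (a & ci) | (b & ci)
    return r, co


def to_bits(x, n):
    """Split integer x into n bits, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Rebuild an integer from a list of bits, least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_add(a_bits, b_bits, c0=0):
    """Chain full adders so each carry-out feeds the next stage's carry-in."""
    carry = c0
    out = []
    for a, b in zip(a_bits, b_bits):
        r, carry = full_adder(a, b, carry)
        out.append(r)
    return out, carry


sum_bits, carry_out = ripple_add(to_bits(9, 4), to_bits(8, 4))
print(from_bits(sum_bits), carry_out)  # 9 + 8 = 17 -> low 4 bits 1, carry 1
```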

5
Carry-ripple Adder Revisited
❑ Each cell:
ri = ai ⊕ bi ⊕ cin
cout = aicin + aibi + bicin = cin(ai + bi) + aibi

❑ 4-bit adder: “Full adder cell”

❑ What about subtraction?

8
Subtractor/Adder
A - B = A + (-B)
How do we form -B?
1. complement B
2. add 1
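
A minimal sketch of the two steps above, assuming an n-bit adder with a carry-in: a hypothetical `sub` control signal inverts B and also drives the carry-in, which supplies the "add 1". The function name and the integer representation are illustrative choices.

```python
def add_sub(a, b, sub, n=4):
    """Return (result, carry_out) of a + b (sub=0) or a - b (sub=1) on n bits."""
    mask = (1 << n) - 1
    # Step 1: complement B when subtracting.
    b_eff = (b ^ mask) if sub else (b & mask)
    # Step 2: add 1 by driving the carry-in with the sub signal.
    total = (a & mask) + b_eff + (1 if sub else 0)
    return total & mask, (total >> n) & 1


print(add_sub(7, 3, 1))  # (4, 1): 7 - 3 = 4
print(add_sub(3, 7, 1))  # (12, 0): 12 is -4 in 4-bit two's complement
```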

9
Delay in Ripple Adders
❑ Ripple delay amount is a function of the data inputs:
[Figure: example operand bit patterns with carries settling at times t0 through t3]

❑ However, we usually only consider the worst case delay on the critical path.
There is always at least one set of input data that exposes the worst case delay.
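
One way to see the data dependence is to measure how far a carry travels for a given pair of operands. The sketch below is an illustration only: it assumes c0 = 0 and uses the longest run of stages a single carry passes through as a stand-in for settling time.

```python
def longest_carry_chain(a, b, n):
    """Length of the longest carry run in an n-bit ripple add with c0 = 0.

    A run starts at a bit where both inputs are 1 (carry created) and
    continues through following bits where exactly one input is 1.
    """
    longest = 0
    chain = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if ai & bi:
            chain = 1              # carry created at this stage
        elif (ai ^ bi) and chain:
            chain += 1             # incoming carry passed along
        else:
            chain = 0              # carry absorbed, or nothing to pass
        longest = max(longest, chain)
    return longest


print(longest_carry_chain(0b0101, 0b0101, 4))  # 1: carries die immediately
print(longest_carry_chain(0b1111, 0b0001, 4))  # 4: worst case for 4 bits
```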
10
Adders (cont.)
Ripple Adder

Ripple adder is inherently slow because, in the worst case, s7 must wait for c7, which must wait for c6 …

T ∝ n, Cost ∝ n

How do we make it faster, perhaps with more cost?

11
Carry Select Adder

T = T_ripple_adder / 2 + T_MUX
COST = 1.5 * COST_ripple_adder + (n/2 + 1) * COST_MUX
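
A behavioral sketch of the two-block case behind these formulas: the upper half is added twice, once for each possible carry-in, and a mux picks the right result when the lower half's carry arrives. That accounts for 1.5x the ripple hardware and n/2 + 1 mux bits (n/2 sum bits plus the carry-out). Names such as `block_add` are assumptions for illustration.

```python
def block_add(a, b, cin, n):
    """Behavioral n-bit adder block: returns (sum, carry_out)."""
    mask = (1 << n) - 1
    total = (a & mask) + (b & mask) + cin
    return total & mask, total >> n


def carry_select_add(a, b, n=8):
    """Two-block carry-select add of n-bit a and b with carry-in 0."""
    half = n // 2
    hi_bits = n - half
    a_lo, b_lo = a & ((1 << half) - 1), b & ((1 << half) - 1)
    a_hi, b_hi = a >> half, b >> half

    # Lower half and both speculative upper halves run in parallel.
    s_lo, c_mid = block_add(a_lo, b_lo, 0, half)
    s_hi0, c_out0 = block_add(a_hi, b_hi, 0, hi_bits)
    s_hi1, c_out1 = block_add(a_hi, b_hi, 1, hi_bits)

    # The mux is controlled by the carry out of the lower half.
    s_hi, c_out = (s_hi1, c_out1) if c_mid else (s_hi0, c_out0)
    return (s_hi << half) | s_lo, c_out


print(carry_select_add(200, 100))  # (44, 1): 300 = 256 + 44
```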

12
Carry Select Adder
❑ Extending Carry-select to multiple blocks

❑ What is the optimal # of blocks and # of bits/block?


▪ If blocks are too small, delay is dominated by total mux delay
▪ If blocks are too large, delay is dominated by adder ripple delay
T ∝ sqrt(N),
Cost ≈ 2*ripple + muxes
13
Carry Select Adder

❑ Compare to ripple adder delay:


T_total = 2 sqrt(N) T_FA - T_FA, assuming T_FA = T_MUX
For ripple adder T_total = N T_FA
“cross-over” at N=3, carry-select faster for any value of N>3.
❑ Is sqrt(N) really the optimum?
▪ From right to left, increase the size of each block to better match delays
▪ Ex: 64-bit adder, use block sizes [12 11 10 9 8 7 7]; the exact answer depends on the relative delay of mux and FA

(note: one less block than sqrt(N) solution)
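
The effect of graded block sizes can be checked with a simple delay model. This is a sketch under stated assumptions: every block ripples internally at one T_FA per bit, all blocks start at time 0, each block after the first adds one mux delay once its select carry arrives, and T_FA = T_MUX = 1 as on the slide. The function name is hypothetical.

```python
def carry_select_delay(block_sizes, t_fa=1, t_mux=1):
    """Delay of a multi-block carry-select adder.

    block_sizes lists the blocks from least to most significant.
    """
    # The least significant block is a plain ripple adder.
    t = block_sizes[0] * t_fa
    for size in block_sizes[1:]:
        ready = size * t_fa             # both speculative sums are done
        t = max(t, ready) + t_mux       # wait for select carry, then mux
    return t


uniform = [8] * 8                                   # sqrt(64) blocks of 8 bits
graded = list(reversed([12, 11, 10, 9, 8, 7, 7]))   # slide's sizes, LSB first

print(carry_select_delay(uniform))  # 15, matching 2*sqrt(64) - 1
print(carry_select_delay(graded))   # 13
print(carry_select_delay([64]))     # 64, plain ripple
```

Under this model the graded sizes save two delay units because each block finishes rippling just as its select carry arrives.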


14
Carry-lookahead and Parallel Prefix

15
Tricks with Trees

16
Reductions with Trees

[Figure: binary reduction tree with N inputs and depth log2 N]
If each node (operator) is k-ary instead of binary, what is the delay?

Demmel - CS267 Lecture 6+ 17


Trees for optimization
Linear chain, T = O(N):
((((((x0 + x1) + x2) + x3) + x4) + x5) + x6) + x7

Balanced tree, T = O(log N):
((x0 + x1) + (x2 + x3)) + ((x4 + x5) + (x6 + x7))

❑ What property of “+” are we exploiting?
❑ Other associative operators? Boolean operations? Division? Min/Max?
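
The two groupings can be compared directly. The sketch below (function names are illustrative) returns both the result and the number of operator levels on the longest path. Regrouping is only safe for associative operators, so subtraction is included as a counterexample.

```python
import operator


def linear_reduce(op, xs):
    """Left-to-right chain: depth grows linearly with len(xs)."""
    acc = xs[0]
    depth = 0
    for x in xs[1:]:
        acc = op(acc, x)
        depth += 1
    return acc, depth


def tree_reduce(op, xs):
    """Pairwise balanced tree: depth grows as log2(len(xs))."""
    level = list(xs)
    depth = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # odd element passes to the next level
        level = nxt
        depth += 1
    return level[0], depth


xs = list(range(8))
print(linear_reduce(operator.add, xs))  # (28, 7)
print(tree_reduce(operator.add, xs))    # (28, 3)
print(linear_reduce(operator.sub, xs))  # (-28, 7)
print(tree_reduce(operator.sub, xs))    # (0, 3): regrouping changed the result
```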
18
Parallel Prefix, or “Scan”

❑ If “+” is an associative operator, and x0, …, xp-1 are input data, then the parallel prefix operation computes:

x0, x0 + x1, x0 + x1 + x2, …

yj = x0 + x1 + … + xj for j = 0, 1, …, p-1

[Figure: 16-input parallel prefix network, inputs x0 … x15, outputs y0 … y15]
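
A minimal software sketch of a scan in log2(p) levels. It uses the doubling-stride pattern (each level combines an element with the one `step` positions to its left), which is one of several possible prefix networks and is not necessarily the one drawn on the slide. Operand order is preserved, so the operator needs to be associative but not commutative.

```python
import operator


def parallel_prefix(op, xs):
    """Return (prefix results, number of levels) for an associative op."""
    y = list(xs)
    n = len(y)
    step = 1
    levels = 0
    while step < n:
        # Every update in a level reads only the previous level, so all
        # of them could run at the same time in hardware.
        y = [y[j] if j < step else op(y[j - step], y[j]) for j in range(n)]
        step *= 2
        levels += 1
    return y, levels


print(parallel_prefix(operator.add, [1, 2, 3, 4, 5, 6, 7, 8]))
# ([1, 3, 6, 10, 15, 21, 28, 36], 3)
print(parallel_prefix(operator.add, list("abcd")))
# (['a', 'ab', 'abc', 'abcd'], 2)
```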

19
Carry Look-ahead Adders
❑ How do we arrange carry generation to be associative?
❑ Reformulate basic adder stage:

[Table: truth table with columns a, b, ci, ci+1, s]

carry “kill”
ki = ai’ bi’

carry “propagate”
pi = ai ⊕ bi

carry “generate”
gi = ai bi

ci+1 = gi + pici
si = pi ⊕ ci

20
Carry Look-ahead Adders
❑ Ripple adder using p and g signals:
pi = ai ⊕ bi
gi = ai bi

s0 = p0 ⊕ c0
c1 = g0 + p0c0

s1 = p1 ⊕ c1
c2 = g1 + p1c1

s2 = p2 ⊕ c2
c3 = g2 + p2c2

s3 = p3 ⊕ c3
c4 = g3 + p3c3

❑ So far, no advantage over ripple adder: T ∝ N
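
The p/g form of the ripple adder can be written out as a short model. This is a sketch with an illustrative function name. The loop still passes the carry from stage to stage, which is why the delay stays proportional to N.

```python
def pg_ripple_add(a, b, c0=0, n=4):
    """n-bit add using p_i = a_i XOR b_i, g_i = a_i AND b_i at each stage."""
    carry = c0
    s = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p = ai ^ bi                    # propagate
        g = ai & bi                    # generate
        s |= (p ^ carry) << i          # s_i = p_i XOR c_i
        carry = g | (p & carry)        # c_{i+1} = g_i + p_i c_i
    return s, carry


print(pg_ripple_add(0b1011, 0b0110))  # (1, 1): 11 + 6 = 17
```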

21
Carry Look-ahead Adders
❑ “Group” propagate and generate signals:
P = pi pi+1 … pi+k
G = gi+k + pi+kgi+k-1 + … + (pi+1pi+2 … pi+k)gi

❑ P true if the group as a whole propagates a carry to cout
❑ G true if the group as a whole generates a carry
cout = G + Pcin
❑ Group P and G can be generated hierarchically.
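
As a check on the group definitions, the sketch below computes P and G for a group of bits and confirms that cout = G + P cin matches ordinary addition. The helper names and the LSB-first lists are assumptions for illustration.

```python
def group_pg(p_bits, g_bits):
    """Group propagate/generate from per-bit p and g, LSB first."""
    P, G = 1, 0
    for p, g in zip(p_bits, g_bits):
        G = g | (p & G)    # a lower generate survives only if this bit propagates
        P = P & p          # the group propagates only if every bit does
    return P, G


def group_carry_out(a, b, cin, k):
    """Carry out of a k-bit group computed as G + P*cin."""
    p_bits = [((a >> i) ^ (b >> i)) & 1 for i in range(k)]
    g_bits = [((a >> i) & (b >> i)) & 1 for i in range(k)]
    P, G = group_pg(p_bits, g_bits)
    return G | (P & cin)


print(group_carry_out(0b101, 0b010, 1, 3))  # 1: all bits propagate the carry-in
print(group_carry_out(0b101, 0b010, 0, 3))  # 0: nothing generated
```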

22
Carry Look-ahead Adders
9-bit example of hierarchically generated P and G signals:

Group a (a0..a2, b0..b2): Pa, Ga
c3 = Ga + Pac0

Group b (a3..a5, b3..b5): Pb, Gb
c6 = Gb + Pbc3

Group c (a6..a8, b6..b8): Pc, Gc

P = PaPbPc
G = Gc + PcGb + PbPcGa
c9 = G + Pc0

23
8-bit Carry Look-ahead Adder

[Figure: eight p,g leaf cells (bits 0 to 7) feeding a tree of P,G combining cells that returns carries c1 through c8]

Leaf cell (p,g), inputs ai, bi, ci:
p = a ⊕ b
g = ab
s = p ⊕ ci
ci+1 = g + cip

Combining cell (P,G), inputs Pa,Ga, Pb,Gb and cin:
P = PaPb
G = Gb + GaPb
Cout = G + cinP

Blocks without the slash don’t perform the carry operation.

24
8-bit Carry Look-ahead Adder with 2-input gates.

Bits 0-1: P8 = p0p1, G8 = g1 + p1g0
c1 = g0 + p0c0
c2 = G8 + P8c0

Bits 2-3: P9 = p2p3, G9 = g3 + p3g2
c3 = g2 + p2c2

Bits 0-3: Pc = P8P9, Gc = G9 + P9G8
c4 = Gc + Pcc0

Bits 4-5: Pa = p4p5, Ga = g5 + p5g4
c5 = g4 + p4c4
c6 = Ga + Pac4

Bits 6-7: Pb = p6p7, Gb = g7 + p7g6
c7 = g6 + p6c6

Bits 4-7: Pd = PaPb, Gd = Gb + PbGa

Bits 0-7: Pe = PcPd, Ge = Gd + PdGc
c8 = Ge + Pec0

25
Parallel-Prefix Review
• Lowest delay for a reduction is a balanced tree (Ex: AND reduction, depth log2n).
• In cases where all intermediate values are required, one way is to use “Parallel Prefix” (depth log2n):

y0 = x0
y1 = x0x1
y2 = x0x1x2
...

Can carry generation be made to be a kind of “reduction operation”?

Parallel Prefix requires that the operation be associative, but simple carry generation is not!
26
Parallel-Prefix Carry Look-ahead Adders
❑ Ground truth specification of all carries directly (no grouping):
ci+1 = gi + pici

c0 = 0
c1 = g0 + p0c0 = g0
c2 = g1 + p1c1 = g1 + p1g0
c3 = g2 + p2c2 = g2 + p2g1 + p1p2g0
c4 = g3 + p3c3 = g3 + p3g2 + p2p3g1 + p1p2p3g0

Binary (G, P) associative operator. Assumes carry signal moving from right to left. Not commutative.

Can be used to form all carries!

Use binary (G,P) operator to form parallel prefix tree
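
The operator itself was drawn on the slide and does not survive in the text. The sketch below assumes the usual form, consistent with the combining-cell equations G = Gb + GaPb and P = PaPb given earlier: the left operand is the more significant group. It checks associativity exhaustively and shows that swapping operands changes the result. The name `gp_combine` is hypothetical.

```python
from itertools import product


def gp_combine(left, right):
    """Combine (G, P) of a more significant group with a less significant one."""
    g_l, p_l = left
    g_r, p_r = right
    # The pair generates if the left group does, or if the right group
    # generates and the left group propagates it.
    return (g_l | (p_l & g_r), p_l & p_r)


pairs = list(product((0, 1), repeat=2))

is_associative = all(
    gp_combine(gp_combine(x, y), z) == gp_combine(x, gp_combine(y, z))
    for x, y, z in product(pairs, repeat=3)
)
is_commutative = all(
    gp_combine(x, y) == gp_combine(y, x)
    for x, y in product(pairs, repeat=2)
)

print(is_associative)  # True
print(is_commutative)  # False, e.g. (1, 0) and (0, 0)
```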


27
Parallel Prefix Adder Example
Inputs: g3 p3, g2 p2, g1 p1, g0 p0

First level:
G = g3 + g2p3, P = p3p2
G = g2 + g1p2, P = p2p1
G = g1 + g0p1 = c2, P = p1p0
c1 = g0

Second level:
G = g2 + g1p2 + g0p2p1
= c3
G = g3 + g2p3 + (g1 + g0p1)p3p2
= g3 + g2p3 + g1p3p2 + g0p3p2p1
= c4

si = ai ⊕ bi ⊕ ci = pi ⊕ ci
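
The example generalizes to any width. Below is a sketch of a prefix adder that combines (G, P) pairs with a doubling stride, as in the 4-bit example where the second level reaches two positions back. It assumes c0 = 0, so the group generate over bits 0..i is exactly c(i+1). The function name and integer interface are illustrative.

```python
def prefix_add(a, b, n=8):
    """n-bit add with carries formed by a log2(n)-level (G, P) prefix network."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]

    G, P = g[:], p[:]
    step = 1
    while step < n:
        G_next, P_next = G[:], P[:]
        for i in range(step, n):
            # Combine the group ending at bit i with the group ending
            # step bits below it.
            G_next[i] = G[i] | (P[i] & G[i - step])
            P_next[i] = P[i] & P[i - step]
        G, P = G_next, P_next
        step *= 2

    carries = [0] + G                  # c0 = 0, c_{i+1} = G over bits 0..i
    s = 0
    for i in range(n):
        s |= (p[i] ^ carries[i]) << i  # s_i = p_i XOR c_i
    return s, carries[n]


print(prefix_add(200, 100))  # (44, 1): 300 = 256 + 44
```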
28
Other Parallel Prefix Adder Architectures

Kogge-Stone adder: minimum logic depth, and full binary tree with minimum fan-out, resulting in a fast adder but with a large area

Ladner-Fischer adder: minimum logic depth, large fan-out requirement up to n/2

Han-Carlson adder: hybrid design combining stages from the Brent-Kung and Kogge-Stone adder

Brent-Kung adder: minimum area, but high logic depth
29
Carry look-ahead Wrap-up
❑ Adder delay: O(log N).
❑ Cost?
❑ Can be applied with other techniques. Group P & G signals can be generated for sub-adders, but another carry propagation technique (for instance ripple) used within the group.
▪ For instance, on FPGAs ripple carry up to 32 bits is fast, and CLA is used to extend to large adders. The CLA tree quickly generates the carry-in for upper blocks.

30
Bit-serial Addition, Adder summary

31
Bit-serial Adder
❑ Addition of 2 n-bit numbers:
▪ takes n clock cycles,
▪ uses 1 FF, 1 FA cell, plus registers
▪ the bit streams may come from or go to other circuits, therefore the registers might not be needed.

• A, B, and R held in shift-registers. Shift right once per clock cycle.
• Reset is asserted by controller.
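
A cycle-by-cycle sketch of this datapath, assuming reset clears the carry flip-flop and that the result register R shifts right with each new sum bit entering at its most significant position. Variable names are illustrative.

```python
def bit_serial_add(a, b, n):
    """Add two n-bit numbers one bit per clock cycle; returns (R, final carry)."""
    carry_ff = 0                 # reset asserted before the first cycle
    r = 0
    for _ in range(n):
        a_bit, b_bit = a & 1, b & 1              # LSBs of shift registers A, B
        s = a_bit ^ b_bit ^ carry_ff             # single full-adder cell
        carry_ff = (a_bit & b_bit) | (carry_ff & (a_bit ^ b_bit))
        a >>= 1                                  # A and B shift right
        b >>= 1
        r = (r >> 1) | (s << (n - 1))            # R shifts right, sum enters at MSB
    return r, carry_ff


print(bit_serial_add(5, 6, 4))   # (11, 0)
print(bit_serial_add(15, 1, 4))  # (0, 1)
```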

32
Adders on FPGAs
• Dedicated carry logic provides fast arithmetic carry capability for high-speed arithmetic functions.
• On Virtex-5:
  • Cin to Cout (per bit) delay = 40ps, versus 900ps for F to X delay.
  • 64-bit add delay = 2.5ns.

33
Adder Final Words

Type             Cost    Delay
Ripple           O(N)    O(N)
Carry-select     O(N)    O(sqrt(N))
Carry-lookahead  O(N)    O(log(N))
Bit-serial       O(1)*   O(N)
* not counting shift registers

❑ Dynamic energy per addition for all of these is O(n).
❑ “O” notation hides the constants. Watch out for this!
❑ The “real” cost of the carry-select is at least 2X the “real” cost of the ripple. “Real” cost of the CLA is probably at least 2X the “real” cost of the carry-select.
❑ The actual multiplicative constants depend on the implementation details and technology.
❑ FPGA and ASIC synthesis tools will try to choose the best adder architecture automatically, assuming you specify addition using the “+” operator, as in “assign A = B + C”

34
