Lecture 4 – Logical Effort
Outline
• Motivation
• Model the delay of one gate
• Delay of a chain of gates (multistage)
• Branching
• Minimum delay
• Best number of stages and gate sizing
• Examples
• Limitations
Logical Effort - 2
Logical Effort Motivation
• Sizing of a chain of inverters
– Geometric progression
• How about more complex logic?
• Logical Effort objectives:
– Quick & dirty, back of the envelope sizing
– Make trade-off between circuits
– What is the best circuit topology for a function?
– How many stages of logic give least delay?
– How wide should the transistors be?
• Reference:
– I. Sutherland, B. Sproull, D. Harris, Logical Effort: Designing Fast CMOS Circuits, Academic Press, 1999
Delay in a Logic Gate
• Express delays in process-independent units: $d = d_{abs} / \tau$
– $\tau = 0.69 \cdot 3RC$
– $\tau \approx 3$ ps in a 65 nm process, 60 ps in a 0.6 µm process
• Delay has two components: $d = f + p$
• f: effort delay = gh (a.k.a. stage effort)
– Again has two components
• g: logical effort
– Measures relative ability of gate to deliver current
– $g \equiv 1$ for inverter
• h: electrical effort = Cout / Cin
– Ratio of output to input capacitance
– Sometimes called fanout
• p: parasitic delay
– Represents delay of gate driving no load
– Set by internal parasitic capacitance
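The delay model on this slide can be sketched in a few lines of Python. The function names `gate_delay` and `absolute_delay_ps` are illustrative and not part of the lecture; the NAND2 values (g = 4/3, p = 2) and the 3 ps value of tau are the ones quoted elsewhere in these slides.

```python
def gate_delay(g, h, p):
    """Normalized gate delay d = g*h + p, in units of tau."""
    return g * h + p


def absolute_delay_ps(d, tau_ps):
    """Convert a normalized delay into picoseconds for a given tau."""
    return d * tau_ps


# Illustrative case: a 2-input NAND (g = 4/3, p = 2) with electrical effort h = 3.
d_nand2 = gate_delay(g=4 / 3, h=3, p=2)
print(d_nand2)                          # normalized delay, in units of tau
print(absolute_delay_ps(d_nand2, 3.0))  # picoseconds, assuming tau = 3 ps
```

Keeping the normalized delay separate from tau mirrors the slide: the same d applies to any process, and only the conversion factor changes.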
Units of effort
[Figure: delay D plotted against electrical effort h, annotated with g and p]
• Reference is the inverter:
– $g_{inv} = 1$
– $p_{inv} = 1$
• g is a function of the complexity of a gate, not its size
• p is a function of the technology and gate type
• $\tau \approx 9$ ps in a typical 130 nm process
Computing Logical Effort
• DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current (i.e., effort delay under the same loading).
• Measure from delay vs. fanout plots
• Or estimate by counting transistor widths
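[Figure: transistor widths for an inverter, a 2-input NAND, and a 2-input NOR, each sized for the same output current as the unit inverter]
Gate       PMOS widths   NMOS widths   Cin per input   g
Inverter   2             1             3               3/3 = 1
NAND2      2, 2          2, 2          4               4/3
NOR2       4, 4          1, 1          5               5/3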
Logical effort (g)
Gate type   Number of inputs
            1     2     3     4     5      n
Inverter    1
NAND              4/3   5/3   6/3   7/3    (n + 2)/3
NOR               5/3   7/3   9/3   11/3   (2n + 1)/3
Computing Parasitic Delay
• Measure from delay vs. fanout plots
• Or estimate by counting self loading on output node for sizing with
equal output current to inverter
• For simplification, we ignore internal node capacitance
[Figure: the same inverter, 2-input NAND, and 2-input NOR, with the capacitance Cout on the output node marked]
Gate       Cout on output node   p
Inverter   3                     3/3 = 1
NAND2      6                     6/3 = 2
NOR2       6                     6/3 = 2
Parasitic Delay (p)
Gate type Parasitic Delay
Inverter 1
n-input NAND n
n-input NOR n
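The closed forms in the logical effort and parasitic delay tables can be captured as a small lookup. This is a minimal sketch: the function names and the gate labels "inv", "nand", and "nor" are illustrative choices, and the parasitic delays are expressed in units of the inverter's parasitic delay, as in the table.

```python
from fractions import Fraction


def logical_effort(gate, n=1):
    """Logical effort g for an inverter or an n-input NAND/NOR."""
    if gate == "inv":
        return Fraction(1)
    if gate == "nand":
        return Fraction(n + 2, 3)
    if gate == "nor":
        return Fraction(2 * n + 1, 3)
    raise ValueError(f"unknown gate type: {gate}")


def parasitic_delay(gate, n=1):
    """Parasitic delay p, ignoring internal node capacitance."""
    if gate == "inv":
        return 1
    if gate in ("nand", "nor"):
        return n
    raise ValueError(f"unknown gate type: {gate}")


for n in (2, 3, 4):
    print(n, logical_effort("nand", n), logical_effort("nor", n), parasitic_delay("nand", n))
```

Using `Fraction` keeps values such as 4/3 and 5/3 exact, which makes it easy to compare against the table.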
Delay components
I. Sutherland, et al, Logical Effort, Academic Press, 1999
FO4 Example
[Figure: inverter driving a fanout-of-4 load]
• Cout = 4Cin, so h = 4
• g = 1
• p = 1
• $D = gh + p = 4 + 1 = 5\tau$
• FO4 delay ≈ L/3 ps, where L = technology node in nm
– Ex: 130 nm → 43 ps, 45 nm → 15 ps
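The FO4 rule of thumb can be turned around to estimate tau for a node. The sketch below assumes the L/3 ps rule and p_inv = 1 from this slide; the function names are illustrative.

```python
def fo4_delay_tau(p_inv=1.0):
    """FO4 inverter delay in units of tau: g*h + p with g = 1, h = 4."""
    return 1.0 * 4.0 + p_inv


def fo4_rule_of_thumb_ps(node_nm):
    """Rule-of-thumb FO4 delay in ps for a technology node given in nm."""
    return node_nm / 3.0


def implied_tau_ps(node_nm, p_inv=1.0):
    """Tau implied by the rule of thumb: FO4 delay divided by 5 when p_inv = 1."""
    return fo4_rule_of_thumb_ps(node_nm) / fo4_delay_tau(p_inv)


for node in (130, 45):
    print(node, fo4_rule_of_thumb_ps(node), implied_tau_ps(node))
```

The implied tau is only as good as the rule of thumb; a measured tau for a specific process can differ.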
More complex circuit
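[Figure: a driving gate (g = 6/3, Cin = 1) whose output fans out to three gate inputs; load labels in the figure: g = 9/3, Cin = 3/2, Cin = 1/2, g = 1, Cin = 1]
• $h = 3$
• $g = 6/3$
• $p = 4$
• $D = gh + p = 10\tau$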
Multistage effort
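[Figure: two-stage path with node capacitances C1, C2, C3]
• Define path electrical effort H as: $H = C_3 / C_1$
• Stage electrical efforts: $h_1 = C_2 / C_1$ and $h_2 = C_3 / C_2$
• $H = h_1 h_2$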
Branching
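[Figure: path with C1 = 1, C2 = 1, C3 = 1, where the first stage fans out to three gates of size S = 1]
• Without branching: $h_1 h_2 = H$
• With branching:
– $h_1 = 3C_2 / C_1$
– $h_2 = C_3 / C_2$
– $h_1 h_2 = 3H$
• Need to account for branching!
• Branching effort of stage i: $b_i = \dfrac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$
• Here: $C_{on\text{-}path,1} = C_2$ and $C_{off\text{-}path,1} = 2C_2$, so $b_1 = 3C_2 / C_2 = 3$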
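A short numeric check of the branching example, as a sketch. The helper name `branching_effort` is illustrative; the capacitance values are the ones on the slide (C1 = C2 = C3 = 1, with two off-path copies of C2).

```python
def branching_effort(c_on_path, c_off_path):
    """b = (C_on-path + C_off-path) / C_on-path."""
    return (c_on_path + c_off_path) / c_on_path


c1 = c2 = c3 = 1.0

# Stage 1 drives the on-path gate (C2) plus two identical off-path gates.
b1 = branching_effort(c_on_path=c2, c_off_path=2 * c2)

H = c3 / c1        # path electrical effort
h1 = b1 * c2 / c1  # stage 1 sees the total load 3*C2
h2 = c3 / c2       # stage 2 has no branching

print(b1, h1 * h2, b1 * H)  # the product of stage efforts equals B*H
```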
Equivalent Path Efforts
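• Path electrical effort: $H = C_{out} / C_{in}$
• Path branching effort: $B = \prod b_i$
• Path logical effort: $G = \prod g_i$
• Path effort: $F = GBH = \prod g_i h_i$
– $BH = \prod h_i$
• Path parasitic delay: $P = \sum p_i$
• Path effort:
– Does not change with added inverters
– Does not depend on sizes, but on topology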
Minimum delay
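[Figure: two-stage path with capacitances C1, C2, C3]
• Total delay: $D = g_1 h_1 + p_1 + g_2 h_2 + p_2$
• Substitute $h_2 = H / h_1$: $D = g_1 h_1 + p_1 + g_2 \dfrac{H}{h_1} + p_2$
• Minimize the delay: $\dfrac{\partial D}{\partial h_1} = g_1 - g_2 \dfrac{H}{h_1^2} = 0$
• Solve for the minimum delay, using $H / h_1 = h_2$: $g_1 - g_2 \dfrac{h_2}{h_1} = 0$, so $g_1 h_1 = g_2 h_2$
• Delay is optimal when effort delays are equal: $f_1 = f_2$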
Minimum Delay
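• Path effort: $F = GBH = \prod g_i h_i$
• Stage effort of stage i: $f_i = g_i h_i$
• Optimal stage effort is the same for every stage: $\hat{f} = f_i$
• For N stages: $F = \hat{f}^N \Rightarrow \hat{f} = F^{1/N}$
• Minimum delay: $\hat{D} = \sum_i (g_i h_i + p_i) = N\hat{f} + \sum_i p_i = N\hat{f} + P$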
Example to compute min. delay
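[Figure: chain of three NAND gates a, b, c with Cin = 4 and CL = 108]
• $G = g_a g_b g_c = (4/3)^3 = 2.37$
• $B = 1$
• $H = C_L / C_{in} = 27$
• $F = GBH = 2.37 \times 27 = 63.99$
• $\hat{f} = F^{1/3} \approx 4$
• $\hat{D} = 3(4) + 3(2) = 12 + 6 = 18\tau$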
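The example can be reproduced with a small helper, as a sketch under the slide's assumptions (three NAND2 stages with g = 4/3 and p = 2 each, no branching). The function name `min_path_delay` and its argument names are illustrative.

```python
import math


def min_path_delay(gs, ps, H, B=1.0):
    """Return (F, f_hat, D_hat) for a path with stage efforts gs and parasitics ps."""
    n_stages = len(gs)
    G = math.prod(gs)               # path logical effort
    F = G * B * H                   # path effort
    f_hat = F ** (1.0 / n_stages)   # optimal (equal) stage effort
    d_hat = n_stages * f_hat + sum(ps)
    return F, f_hat, d_hat


F, f_hat, d_hat = min_path_delay(gs=[4 / 3] * 3, ps=[2, 2, 2], H=108 / 4)
print(F, f_hat, d_hat)
```

Carrying full precision gives a path effort of 64 rather than the 63.99 that results from rounding G to 2.37 first.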
Stage sizing
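• Compute: $\hat{f} = g \times h$
• $h_i = \hat{f} / g_i = C_{out} / C_{in}$
• $C_{in} = \dfrac{g_i C_{out}}{\hat{f}}$
• Work backwards to size each gate
– Last gate: $C_{in,c} = \dfrac{g_{nand} C_{out}}{\hat{f}}$
[Figure: chain of three NAND gates a, b, c with Cin = 4 and CL = 108]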
Stage sizing example
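[Figure: chain of three NAND gates a, b, c with Cin = 4 and CL = 108; in gate c the unit transistor widths 2 and 2 scale to 18 and 18]
• $C_{in,c} = \dfrac{g_{nand} C_{out}}{\hat{f}} = \dfrac{4/3 \times 108}{4} = 36$
• $C_{in,b} = \dfrac{4/3 \times 36}{4} = 12$
• $C_{in,a} = \dfrac{4/3 \times 12}{4} = 4$
• Check work by verifying input cap spec is met!
• Transistor widths in gate c: $\dfrac{2}{4} \times 36 = 18$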
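The backward sizing pass can be sketched as a loop from the load toward the input. This version assumes no branching, as in the example; the name `size_stages` is illustrative.

```python
def size_stages(gs, c_load, f_hat):
    """Work backwards from the load: Cin_i = g_i * Cout_i / f_hat.

    Returns the input capacitance of each stage, ordered first stage to last.
    """
    c_ins = []
    c_out = c_load
    for g in reversed(gs):
        c_in = g * c_out / f_hat
        c_ins.append(c_in)
        c_out = c_in  # this gate's input is the load of the previous stage
    c_ins.reverse()
    return c_ins


c_a, c_b, c_c = size_stages(gs=[4 / 3] * 3, c_load=108, f_hat=4)
print(c_a, c_b, c_c)  # the first value should match the input cap spec of 4
```

The final check on the slide corresponds to comparing the first returned value against the specified input capacitance.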
Number of stages
• Path effort F can be used to determine the optimal number of stages
– Assume we add $n_2$ inverters to $n_1$ stages of logic
• New number of stages: $N = n_1 + n_2$
• G, B, H don’t change, so F is fixed
• But P increases
• $\hat{D} = N F^{1/N} + \sum_{i=1}^{n_1} p_i + n_2$ (taking $p_{inv} = 1$ for each added inverter)
• Optimum is technology dependent
Best number of stages for $p_{inv} = 1$
Path effort F    Best number of stages   Min. delay      Stage effort
0 to 5.83        1                       1.0 to 6.8      0 to 5.8
5.83 to 22.3     2                       6.8 to 11.4     2.4 to 4.7
22.3 to 82.2     3                       11.4 to 16.0    2.8 to 4.4
82.2 to 300      4                       16.0 to 20.7    3.0 to 4.2
300 to 1090      5                       20.7 to 25.3    3.1 to 4.1
1090 to 3920     6                       25.3 to 29.8    3.2 to 4.0
3920 to 14200    7                       29.8 to 34.4    3.3 to 3.9
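The table can be checked numerically with a brute-force search over N. This sketch assumes every stage carries parasitic delay p_inv, so that D = N(F^(1/N) + p_inv); that assumption reproduces the tabulated boundary delays (for example 5.83 + 1 ≈ 6.8). The function names and the `max_stages` cap are illustrative.

```python
def chain_delay(F, n_stages, p_inv=1.0):
    """Delay of an N-stage path with equal stage effort and parasitic p_inv per stage."""
    return n_stages * (F ** (1.0 / n_stages) + p_inv)


def best_num_stages(F, p_inv=1.0, max_stages=20):
    """Number of stages in 1..max_stages that minimizes chain_delay."""
    return min(range(1, max_stages + 1), key=lambda n: chain_delay(F, n, p_inv))


for F in (3, 10, 100, 1000):
    n = best_num_stages(F)
    print(F, n, round(chain_delay(F, n), 1), round(F ** (1.0 / n), 2))
```

Passing a different `p_inv` shows how the stage boundaries move with technology.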
Summary of the Method
Gate level
• Parasitic delay: p
• Logical effort: g
• Electrical effort: $h = C_{out} / C_{in}$
• Stage effort: $f = gh$
• Stage delay: $d = f + p$
Path
• Path electrical effort: $H = C_{out\text{-}path} / C_{in\text{-}path}$
• Path logical effort: $G = \prod g_i$
• Branch factor, stage: $b_i = \dfrac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$
• Branch effort, path: $B = \prod b_i$
– $BH = \prod h_i$
• Path effort: $F = GBH$
Method
• Path effort: $F = GBH$
• Add buffers: determine optimal number of stages
• Optimal stage effort: $\hat{f} = F^{1/N}$
• Optimal path delay: $\hat{D} = N F^{1/N} + P$
• Stage sizing: $C_{in} = \dfrac{g_i C_{out\text{-}stage}}{\hat{f}}$
• Size xtors in gate
Real Example
• Design the address decoder for a register file.
– 16 words, 32 bits/word
– Each bit presents load of 3 unit-sized xtors
– True and complementary address inputs A[3:0]
– Input driver has input cap of 10 unit-sized xtors
[Figure: 4:16 decoder driving a register file of 16 words by 32 bits; each true and complementary address input has 10 units of input capacitance; each word line, word[0] through word[15], carries 96 units of wordline capacitance; intermediate nodes are labeled y and z]
Number of Stages
• Electrical effort: $H = (32 \times 3) / 10 = 9.6$
• Branching effort: $B = 8$
• Logical effort: $G = 2$
• Path effort: $F = GBH = 153.6$
• Number of stages: $N = 3$
• Stage effort: $\hat{f} = F^{1/3} = 153.6^{1/3} = 5.35$
• Path delay: $D = 3\hat{f} + 2 + 4 = 22.1$ (two inverters with p = 1 each, NAND4 with p = 4)
• Sizing:
– Output inv: Cin = 96 × 1 / 5.35 = 18
– Nand4: Cin = 18 × 2 / 5.35 = 6.7
– Input inv: Cin = 6.7 × 1 × 8 / 5.35 = 10
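The decoder numbers can be recomputed end to end, as a sketch of the INV-NAND4-INV design on this slide. The function name `decoder_design` and the dictionary keys are illustrative; the values of g, p, B, and the loads are those given in the slides.

```python
def decoder_design():
    """Recompute the 3-stage INV-NAND4-INV decoder example."""
    g_inv, g_nand4 = 1.0, 2.0   # NAND4: (4 + 2) / 3 = 2
    p_inv, p_nand4 = 1.0, 4.0
    c_wordline = 32 * 3         # 32 bits, 3 unit transistors each
    c_in_spec = 10.0            # input driver capacitance
    B = 8.0                     # each address line drives 8 of the 16 NAND4s

    H = c_wordline / c_in_spec
    G = g_inv * g_nand4 * g_inv
    F = G * B * H
    n_stages = 3
    f_hat = F ** (1.0 / n_stages)
    D = n_stages * f_hat + (p_inv + p_nand4 + p_inv)

    # Work backwards from the wordline load.
    c_out_inv = c_wordline * g_inv / f_hat
    c_nand4 = c_out_inv * g_nand4 / f_hat
    c_in_inv = c_nand4 * g_inv * B / f_hat  # branching applies at this stage
    return {"F": F, "f_hat": f_hat, "D": D,
            "c_out_inv": c_out_inv, "c_nand4": c_nand4, "c_in_inv": c_in_inv}


result = decoder_design()
for name, value in result.items():
    print(name, round(value, 2))
```

The computed input inverter capacitance returning to the 10-unit specification is the same sanity check used in the stage sizing example.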
Comparison
• Compare many alternatives with a spreadsheet
Design                         N   G      P   D
NOR4                           1   3      4   234
NAND4-INV                      2   2      5   29.8
NAND2-NOR2                     2   20/9   4   30.1
INV-NAND4-INV                  3   2      6   22.1
NAND4-INV-INV-INV              4   2      7   21.1
NAND2-NOR2-INV-INV             4   20/9   6   20.5
NAND2-INV-NAND2-INV            4   16/9   6   19.7
INV-NAND2-INV-NAND2-INV        5   16/9   7   20.4
NAND2-INV-NAND2-INV-INV-INV    6   16/9   8   21.6
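The spreadsheet comparison can be sketched in Python instead. The N, G, and P values are copied from the table; B = 8 and H = 9.6 come from the decoder example; the names `DESIGNS` and `path_delay` are illustrative.

```python
B, H = 8.0, 9.6  # branching and electrical effort of the decoder example

# (design, N, G, P) as listed in the comparison table
DESIGNS = [
    ("NOR4", 1, 3.0, 4),
    ("NAND4-INV", 2, 2.0, 5),
    ("NAND2-NOR2", 2, 20 / 9, 4),
    ("INV-NAND4-INV", 3, 2.0, 6),
    ("NAND4-INV-INV-INV", 4, 2.0, 7),
    ("NAND2-NOR2-INV-INV", 4, 20 / 9, 6),
    ("NAND2-INV-NAND2-INV", 4, 16 / 9, 6),
    ("INV-NAND2-INV-NAND2-INV", 5, 16 / 9, 7),
    ("NAND2-INV-NAND2-INV-INV-INV", 6, 16 / 9, 8),
]


def path_delay(n_stages, G, P):
    """D = N * F^(1/N) + P with F = G*B*H."""
    F = G * B * H
    return n_stages * F ** (1.0 / n_stages) + P


delays = {name: path_delay(n, G, P) for name, n, G, P in DESIGNS}
best = min(delays, key=delays.get)
for name, d in delays.items():
    print(f"{name:30s} {d:6.1f}")
print("fastest:", best)
```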
Wrong number of stages
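• Using more stages than optimal costs less than using fewer
[Figure: delay penalty versus number of stages relative to the optimum; annotated penalties: 1.51 and 1.26]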
Wrong gate size
• Penalty is the same in either sizing direction
[Figure: delay penalty when a single gate is sized wrong; annotated penalties: 1.133 and 1.044]
P/N ratio
• Why use P/N = 2?
– Noise margins are balanced
– ~ Equal slopes
• How about P/N = 1.5?
[Figure: inverters with P/N widths 2/1 and 1.5/1]
Limitations – Internal capacitance
• Capacitance in internal nodes
• Body effect
[Figure: gate schematic with nodes labeled "top", "out", and "Internal node"]
Limitations of Logical Effort
• Chicken and egg problem
– Need path to compute G
– But don’t know number of stages without G
• Simplistic delay model
– Neglects input rise time effects
• Interconnect
– Iteration required in designs with wire
• Maximum speed only
– Not minimum area/power for constrained delay
Sizing tool
• Tool: TILOS [Dunlop 89]
– Start with all transistors of min. size
– Find critical path (Optimize path)
– Compute delays
– Increase size of “critical path”
– Size transistor with best sensitivity in critical path
– Repeat
– Goal of path distribution → All paths equal in length…
Area - Delay
[Figure: two plots: area versus delay, and number of paths versus delay for the original and optimized designs]
• But the impact of process variations can be worse for the optimized paths
Your Homework
• Characterize our process
– Inverter, NAND2/3/4, NOR2/3/4
– Find $\tau$ for process, p and g for each gate
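One way to extract these parameters from delay versus fanout data is a straight-line fit, sketched below. It assumes the model d_abs = tau (g h + p), so the inverter's slope gives tau and each gate's slope and intercept, divided by tau, give g and p. The function names are illustrative, and the delay values are synthetic numbers made up for the demonstration, not measurements from any process.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def characterize(fanouts, inv_delays_ps, gate_delays_ps):
    """Estimate tau and p_inv from the inverter, then g and p for another gate."""
    tau, inv_intercept = fit_line(fanouts, inv_delays_ps)      # g_inv = 1 by definition
    gate_slope, gate_intercept = fit_line(fanouts, gate_delays_ps)
    return {"tau_ps": tau, "p_inv": inv_intercept / tau,
            "g": gate_slope / tau, "p": gate_intercept / tau}


# Synthetic data (hypothetical): tau = 6 ps, p_inv = 3, and a gate with g = 4/3, p = 6.
fanouts = [1, 2, 3, 4, 5]
inv_delays = [6 * (h + 3) for h in fanouts]
gate_delays = [6 * ((4 / 3) * h + 6) for h in fanouts]

params = characterize(fanouts, inv_delays, gate_delays)
print(params)
```

With simulated delays, the fanout for each gate should be the ratio of its load capacitance to its own input capacitance, so that the fitted slopes are comparable across gates.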
Tau for 130nm Tech
[Figure: two plots for the 130 nm process. "Inverter Delays": delay (ps) versus fanout (0 to 5). "Tau's": tau (ps) from the successive differences D(1)-D(0) through D(5)-D(4). Series: Nominal and Fast-fast.]
Tau and Pinv for 130nm Tech
Quantity                 Wp/Wn      Nom    Fast   Slow
Average Tau value (ps)   320/160    6.26   5.74   6.98
                         560/280    6.31   5.66   7.14
                         1120/560   6.37   5.55   7.14
Average Pinv             320/160    3.46   2.64   4.24
                         560/280    3.27   2.86   3.75
                         1120/560   3.42   2.77   3.93
[Figure: P_inv versus Wp/Wn (320/160, 560/280, 1120/560) for the Nominal, Fast-fast, and Slow-slow corners]
Best number of stages for $p_{inv} = 3.38$
Path effort F     Best number of stages   Min. delay       Stage effort
0 to 9.57         1                       3.38 to 12.9     0 to 9.57
9.57 to 54.4      2                       12.9 to 21.5     3.09 to 7.38
54.4 to 294       3                       21.5 to 30.1     3.79 to 6.65
294 to 1563       4                       30.1 to 38.7     4.14 to 6.29
1563 to 8246      5                       38.7 to 47.3     4.35 to 6.07
8246 to 43327     6                       47.3 to 55.8     4.49 to 5.93
above 43327       7
Summary
• Logical effort is useful for thinking of delay in circuits
– Numeric logical effort characterizes gates
– NANDs are faster than NORs in CMOS
– Paths are fastest when effort delays are ~4
– Path delay is weakly sensitive to stages, sizes
– But using fewer stages doesn’t mean faster paths
– Delay of path is about $\log_4 F$ FO4 inverter delays
– Inverters and NAND2 best for driving large caps
• Provides language for discussing fast circuits
– But requires practice to master