EE405 Lectures

This document provides information about the EE 405 VLSI Circuit Design course taught by Dr. Mohamed Abbas at King Saud University. The course is an introduction to VLSI circuit design and will cover topics such as MOS transistor theory, CMOS logic design, layout and design rules, circuit simulation and characterization, combinational and sequential circuit design, and memory system design. Assessment will include midterm exams, quizzes, activities, and a final exam. Students will learn how to design and layout CMOS circuits using CAD tools and simulate their operation using SPICE. The goal is for students to understand VLSI concepts and be able to implement circuit designs.

EE 405

VLSI Circuit Design: An Introduction


Instructor: Mohamed Abbas, PhD
Associate Professor, Electrical Engineering Department
College of Engineering, King Saud University

About the Instructor


• Holds a PhD in VLSI circuit design from the University of Tokyo, Japan (Sept. 2006).
• Worked for about 2.5 years at the VLSI Design and Education Center (VDEC), the University of Tokyo, Japan, in collaboration with ADVANTEST Corporation, Japan.
• Has had many tape-outs in 0.35 μm, 0.18 μm, and 65 nm technologies.
• Has taught such courses to undergraduate and postgraduate students at Assiut University, E-JUST (Egypt), and KSU.
• Has done extensive research in digital, low-power, and mixed-signal circuit design.

2
Why do you take this course?

?????

What is an Integrated Circuit?

• An electronic circuit is composed of individual electronic components, such as resistors, capacitors, inductors, diodes, and transistors, connected by conductive wires.
• An integrated circuit or monolithic integrated circuit (also known as an IC, a chip, or a microchip) is a set of electronic circuits on one small flat piece (or "chip") of semiconductor material, normally silicon [wikipedia.org].

4
Integration level

Name   Signification                    Year   Transistors (analog)    Logic gates (digital)
SSI    small-scale integration          1964   1 to 10                 1 to 12
MSI    medium-scale integration         1968   10 to 500               13 to 99
LSI    large-scale integration          1971   500 to 20 000           100 to 9999
VLSI   very large-scale integration     1980   20 000 to 1 000 000     10 000 to 99 999
ULSI   ultra-large-scale integration    1984   1 000 000 and more      100 000 and more

https://en.wikipedia.org/wiki/Integrated_circuit

Examples of IC Packages

6
IC Products
• Processors
– CPU, DSP, Controllers
• Memory Chips
– RAM, ROM, EEPROMS
• Analog
– Mobile communications,
audio/video processing
• Embedded systems
– Used in cars, factories
– Network cards
• System-on-Chips (SOC)
7

What does this course provide you?


This Course
• Provides the knowledge and skills for you to:
– Qualify to join a VLSI design group, where you can take part in developing an IC product.
– Implement a specific design idea as an IC, which can help you patent it or start a business.

8
IC Design Flow
• Idea generation
• Drafting on paper (Schematics)
• Simulation to test the design
• Layout using the target technology's design suite
• Netlist extraction and simulation to test the functionality (post-layout simulation)
• Once the simulation results are correct, tape-out to a foundry for fabrication
• Post-silicon test
• Once the test results are correct, start of mass production
9

Full Chip Layout Cadence Virtuoso Layout Editor

10
Die Micrograph

11

How to do it?

You need to:


• Understand the concepts of VLSI.
• Acquire the methods of VLSI circuit design.
• Know how to use CAD (EDA) tools for electronic circuit simulation, circuit layout, and design verification.

That is what you will learn in this course.

12
Covered Topics

• Introduction to VLSI systems
• MOS transistor theory
• CMOS logic design and fabrication
• Layout and design rules
• Electronic circuit analysis using SPICE

13

Covered Topics.. Cont.

• Circuit characterization and performance estimation
• Combinational and sequential circuit design
• Static and dynamic CMOS gates
• Memory system design

14
Pre- and co-requisites
• Microelectronic devices and circuits (EE310)
• Highly recommended to take EE 406 in parallel
with this course

15

Course Objectives
At the end of the course, the student should be able to:
• Use a transistor-level circuit simulator such as SPICE.
• Understand the theory, operation, and trends of CMOS technology.
• Understand the manufacturing steps of CMOS devices.
• Understand the concepts of digital circuit integration.

16
Course Objectives Cont…
• Lay out a CMOS circuit and verify its operation.
• Calculate propagation delay, noise margins, and power dissipation in digital VLSI circuits.
• Design combinational (e.g., arithmetic) and sequential circuits.
• Understand the concepts of memory in VLSI circuits.
• Design memory in VLSI circuits.

17

References
• Textbook:
– CMOS VLSI Design: A Circuits and Systems Perspective, by Neil H. Weste and David M. Harris, 4th Edition, Addison-Wesley, 2011.
• References:
– PSPICE Tutorials
http://www1bpt.bridgeport.edu/~thapa/eleg458/tut/pspice/PSPICE_MOS_TUT5.pdf

18
Needed (CADs) EDAs
• PSPICE Student Version
http://www.engr.uky.edu/~cathey/pspice061301.html

19

Student Assessment Criteria

Item                    Grade %
First Mid-Term          20%
Second Mid-Term         20%
Quizzes & Activities    20%
Final Exam              40%

20
Find Me
• Dr. Mohamed Abbas
[email protected]
• Office : 2C27
• Office Hours:
– Sunday: 10:00 ~12:00
– Thursday: 10:00~12:00
• You may make an appointment by e-mail if necessary.

21

Mid-Term Exam Dates:


• First Mid-Term Exam:
– Date: Tue 29/9/2020
– Time: 13:00
• Second Mid-Term Exam
– Date: Tue 03/11/2020
– Time: 13:00

22
Lecture-Set 3
Circuit Simulation Using SPICE

21

SPICE: Simulation Program with Integrated Circuit Emphasis

22
SPICE - FEATURES
• SPICE uses numerical techniques to solve the nodal analysis of a circuit. It supports the following:
– Textual input to specify the circuit and simulation commands
– Text or graphical output format for simulation results
– Circuit elements:
• Passive elements: resistors, capacitors, inductors, and transmission lines
• Active devices: diodes, BJTs, JFETs, and MOSFETs
• Independent and dependent voltage (V) and current (I) sources
– Analysis types:
• DC: calculates the DC transfer curve.
• AC: calculates the output as a function of frequency. A Bode plot is generated.
• Transient: calculates the voltage and current as a function of time when a large signal is applied.
• Noise
23

SPICE – FEATURES Cont.


• Temperature
• Fourier analysis: calculates and plots the frequency spectrum.
• Monte Carlo analysis
• SPICE also supports:
– A wide variety of active device models
– Process parameter variation
– Effects of worst/best case and statistical spreads of the process
– Design optimization
– Component libraries
– Behavioral modeling

24
INPUT FORMAT
• A SPICE file is made up of a series of statements.
– Each statement is on one line, unless it is continued onto the next line by starting that line with + as the first character.
– Each statement is made up of fields. Fields are separated by , = ( ) or one or more spaces.
– Fields consist of SPICE keywords, SPICE symbols, names (alphanumeric, up to 16 characters), numbers (integer or floating point), or scale factors.
– The scale factors recognized by SPICE are:
• T = 1E12   G = 1E9   MEG = 1E6   K = 1E3   MIL = 25.4E-6   M = 1E-3   U = 1E-6   N = 1E-9   P = 1E-12   F = 1E-15

25

INPUT FORMAT Cont.


– Letters immediately following a number that are not scale factors are ignored, as are letters immediately following a scale factor, so 1K, 1000volts, 1KV, and 1.0E3 are all the same.
– Comment lines start with * or $ as the first character.
– A SPICE file must start with a title statement and finish with a .END statement.
– The order of statements between the title and .END is arbitrary.
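The number rules above (scale factors, ignored trailing letters) can be sketched in a few lines of Python. This is only an illustration of the rules as stated on the slides, not the parser of any particular SPICE program; the function name parse_spice_number is made up for this sketch.

```python
import re

# Scale factors exactly as listed on the slide (SPICE is case-insensitive).
SCALE = {
    "T": 1e12, "G": 1e9, "MEG": 1e6, "K": 1e3, "MIL": 25.4e-6,
    "M": 1e-3, "U": 1e-6, "N": 1e-9, "P": 1e-12, "F": 1e-15,
}

# A number (optionally with an exponent) followed by optional letters.
_NUMBER = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)([A-Za-z]*)$")


def parse_spice_number(text):
    """Convert a SPICE-style numeric field such as '1KV' or '100uf' to a float."""
    match = _NUMBER.match(text.strip())
    if match is None:
        raise ValueError(f"not a SPICE number: {text!r}")
    mantissa, letters = match.groups()
    value = float(mantissa)
    letters = letters.upper()
    # Check the three-letter factors first so MEG and MIL are not read as M.
    for suffix in ("MEG", "MIL", "T", "G", "K", "M", "U", "N", "P", "F"):
        if letters.startswith(suffix):
            return value * SCALE[suffix]
    # Letters that are not a scale factor (e.g. 'volts') are ignored.
    return value


for field in ("1K", "1000volts", "1KV", "1.0E3", "10M", "2MEG"):
    print(field, "->", parse_spice_number(field))
```

Note that M means milli and MEG means mega, so 10M is 0.01, not ten million. This is a common source of mistakes in hand-written netlists.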

26
INPUT FORMAT Cont.
– SPICE files are conventionally structured into the following groups of statements, in this order:
• title
• parameters
• circuit description
• input control
• analysis required
• output format
• models
• end <CR>
– Comments should be inserted to make the file easily readable and the simulation/design 'self-documenting'.

27

Circuit Description
• Independent DC Sources
– Voltage source:
• Vname +Node -Node Type Value
– Current source:
• Iname +Node -Node Type Value

– Examples:
Vin 1 0 DC 10
Is 3 4 DC 1.5

28
Circuit Description Cont.
• Dependent Sources
– Voltage-controlled voltage source:
• Ename N1 N2 NC1 NC2 Value
– Voltage-controlled current source:
• Gname N1 N2 NC1 NC2 Value
– Current-controlled voltage source:
• Hname N1 N2 Vcontrol Value
– Current-controlled current source:
• Fname N1 N2 Vcontrol Value
– Examples:
F1 0 3 Vmeas 0.5
Vmeas 4 0 DC 0
29

Circuit Description Cont.


• Resistors
Rname N1 N2 Value
• Capacitors (C) and Inductors (L)
Cname N1 N2 Value <IC>
Lname N1 N2 Value <IC>
(<IC> is the optional initial condition: the capacitor voltage or the inductor current.)

– Example:
Cap5 3 4 35E-12 5
L12 7 3 6.25E-3 1m

30
Circuit Description Cont.
• Mutual Inductors
Kname Inductor1 Inductor2 value_of_K

– Example

L1 3 5 10M
L2 4 7 3M
K L1 L2 0.81
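For the example above, the coupling coefficient K relates to the mutual inductance through the standard relation M = K * sqrt(L1 * L2). The short Python sketch below evaluates it for the example values; the function name is chosen only for this illustration.

```python
import math


def mutual_inductance(k, l1, l2):
    """Mutual inductance M = k * sqrt(L1 * L2), with 0 <= k <= 1."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("coupling coefficient must lie between 0 and 1")
    return k * math.sqrt(l1 * l2)


# Values from the example: L1 = 10M (10 mH), L2 = 3M (3 mH), K = 0.81
m = mutual_inductance(0.81, 10e-3, 3e-3)
print(f"M = {m * 1e3:.3f} mH")
```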

31

Signal Sources
• Sinusoidal sources
Vname N1 N2 SIN(VO VA FREQ TD THETA PHASE)
• VO - offset voltage in volts
• VA - amplitude in volts
• FREQ - frequency in hertz
• TD - delay in seconds
• THETA - damping factor per second
• PHASE - phase in degrees
– Example:
VG 1 2 SIN(5 10 50 0.2 0.1)
VG2 3 4 SIN(0 10 50)
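To see what the SIN parameters do, the waveform can be evaluated by hand. The Python sketch below uses the damped-sine expression commonly documented for SPICE-family simulators; treat it as an assumption, since details (in particular the value before the delay TD) can differ between simulators.

```python
import math


def sin_source(t, vo, va, freq, td=0.0, theta=0.0, phase=0.0):
    """Value of a SIN(VO VA FREQ TD THETA PHASE) source at time t (seconds)."""
    phase_rad = math.radians(phase)
    if t < td:
        # Before the delay the source holds its starting value (assumed convention).
        return vo + va * math.sin(phase_rad)
    dt = t - td
    damping = math.exp(-theta * dt)
    return vo + va * damping * math.sin(2.0 * math.pi * freq * dt + phase_rad)


# VG 1 2 SIN(5 10 50 0.2 0.1) from the example above
for t in (0.0, 0.2, 0.205, 0.21, 0.215):
    print(f"t = {t:.3f} s  v = {sin_source(t, 5, 10, 50, 0.2, 0.1):.4f} V")
```

With these parameters the source stays at the 5 V offset until 0.2 s, then swings around 5 V with an amplitude that starts at 10 V and decays slowly.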

32
Signal Sources Cont.
• Piecewise linear source (PWL)
Vname N1 N2 PWL(T1 V1 T2 V2 T3 V3 ...)
(Ti Vi) specifies the value Vi of the source at time Ti.
Example:
Vgpwl 1 2 PWL(0 0 10U 5 100U 5 110U 0 r)
• Pulse
Vname N1 N2 PULSE(V1 V2 TD Tr Tf PW Period)
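A piecewise linear source simply interpolates between the listed (Ti Vi) points. The sketch below shows this in Python for the breakpoints of the Vgpwl example; it assumes the source holds the first value before T1 and the last value after the final breakpoint, which is the usual behaviour, and the function name is made up for this illustration.

```python
def pwl(t, points):
    """Linear interpolation through (time, value) pairs sorted by time."""
    if t <= points[0][0]:
        return points[0][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    # After the last breakpoint the last value is held (assumed convention).
    return points[-1][1]


# Breakpoints of PWL(0 0 10U 5 100U 5 110U 0)
breakpoints = [(0.0, 0.0), (10e-6, 5.0), (100e-6, 5.0), (110e-6, 0.0)]
for t in (5e-6, 50e-6, 105e-6, 200e-6):
    print(f"t = {t * 1e6:6.1f} us  v = {pwl(t, breakpoints):.2f} V")
```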

33

Semiconductor Devices
• Diode 
Dname N+ N- D1N4148
.model D1N4148 D (IS=0.1PA, RS=16 CJO=2PF BV=100)
• Bipolar transistors
Qname C B E Q2N2222A
.model Q2N2222A NPN (IS=14.34F …)
• MOSFETs
Mname ND NG NS NB ModName L=?? W=??
.model ModName NMOS (KP=  VTO=  LAMBDA=  …)

34
SubCircuits
• Defining a subcircuit
.SUBCKT SUBNAME N1 N2 N3 ...
Element statements
.ENDS SUBNAME

Example:
* Subcircuit for 741 opamp
.subckt opamp741 1 2 3
* +in (=1) -in (=2) out (=3)
rin 1 2 2meg
rout 4 3 75
e 4 0 1 2 100k
.ends opamp741

• Using a subcircuit
vs 1 0 dc 5
r1 1 2 200
rf 2 3 1k
x1 0 2 3 opamp741
.dc vs 0 10 1
.plot dc v(3)
.end

35

Analysis Types
• DC Analysis
.DC SRCname START STOP STEP
.DC SRCname1 START STOP STEP SRCname2 START
+ STOP STEP
– Example: .DC V1 0 20 2
.DC Vds 0 5 0.5 Vgs 0 5 1
• Transient Analysis
.TRAN TSTEP TSTOP <TSTART <TMAX>> <UIC> 
• AC Analysis
.AC LIN    NP    FSTART   FSTOP 
.AC DEC   ND   FSTART   FSTOP 
.AC OCT   NO   FSTART   FSTOP

Example: .AC DEC 10 1000 1E6
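A decade sweep places ND points in every decade, spaced evenly on a logarithmic axis. The Python sketch below lists the frequencies that the example .AC DEC 10 1000 1E6 is expected to visit; the handling of the end point is an assumption and may differ slightly between simulators.

```python
import math


def ac_dec_points(nd, fstart, fstop):
    """Frequencies of a decade sweep with nd points per decade."""
    decades = math.log10(fstop / fstart)
    # Small tolerance so an exact number of decades includes the stop frequency.
    count = int(math.floor(nd * decades + 1e-9)) + 1
    return [fstart * 10.0 ** (i / nd) for i in range(count)]


freqs = ac_dec_points(10, 1000, 1e6)
print(len(freqs), "points from", freqs[0], "Hz to", freqs[-1], "Hz")
```

Three decades at 10 points per decade give 31 frequencies, with each point a factor of 10^(1/10), about 1.26, above the previous one.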
36
Control Statements
• A) .OP statement reports:
– the voltage at the nodes
– the current in each voltage source
– the operating point of each active element
• B) .TF statement reports:
– the ratio of the output variable to the input variable (gain or transfer gain)
– the resistance with respect to the input source
– the resistance with respect to the output terminals
– Example: .TF V(3,0) VIN
• C) .IC statement
– Example: .IC Vnode1 = value Vnode2 = value etc.
• D) .Include statement
– Example: .Include 'c:\example\model.txt'

37

Examples:
• 1. DC Sweep and Thevenin Equivalent Circuit
*DC SWEEP and Thevenin Eq.
VIN 1 0 DC 10
F1 0 3 VMEAS 0.4
VMEAS 4 0 DC 0
*VMEAS is a 0V source to measure i4
R1 1 2 1K
R2 2 3 10K
R3 1 3 15K
R4 2 4 40K
R5 3 0 50K
.OP
.TF V(3,0) VIN
.DC VIN 0 20 2
.probe
.END
38
Example 2 : Simulation of mutual inductances

*Example 2
Vin 1 0 sin(0 5 1000 0 0)
Rs 1 3 100
Rl 4 0 500
L1 3 0 10M
L2 4 0 2M
K L1 L2 0.693
.TRAN 0.1M 10M
.PRINT TRAN V(3) V(4)
.PLOT TRAN V(3) V(4)
.Probe
.END

39

Example 3: (AC Analysis)


• Simulation of a first-order filter
Example AC Analysis
v1 1 0 ac 1
r1 1 2 10k
r2 2 3 100k
c 2 3 10n
e1 3 0 0 2 1e8
.ac dec 10 1 1e4
.plot ac vdb(3)
.plot ac vp(3)
.end
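The netlist above is an inverting amplifier with r2 in parallel with c in the feedback path. As a hand check of what the simulation should plot, the Python sketch below evaluates the ideal response H(f) = -(R2/R1) / (1 + j*2*pi*f*R2*C); treating the gain-1e8 source e1 as an ideal op-amp is an assumption of this sketch.

```python
import cmath
import math

R1, R2, C = 10e3, 100e3, 10e-9  # r1, r2 and c from the netlist


def gain(f):
    """Ideal inverting-amplifier transfer function V(3)/V(1) at frequency f."""
    w = 2.0 * math.pi * f
    z_feedback = R2 / (1.0 + 1j * w * R2 * C)  # r2 in parallel with c
    return -z_feedback / R1


fc = 1.0 / (2.0 * math.pi * R2 * C)  # -3 dB corner frequency
print(f"corner frequency = {fc:.1f} Hz")
for f in (1.0, fc, 1e4):
    h = gain(f)
    print(f"f = {f:8.1f} Hz  |H| = {20 * math.log10(abs(h)):6.2f} dB"
          f"  phase = {math.degrees(cmath.phase(h)):7.2f} deg")
```

The low-frequency gain is 20 dB (R2/R1 = 10) and the response rolls off above roughly 159 Hz, which is what vdb(3) should show.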
40
Example 3:
• Simulation of a Rectifier Circuit (diodes)
rectifier example
vin 2 0 sin(0 170 60 0 0)
rl 5 0 500
rs 2 1 10
L1 1 0 2000
L2 3 0 20
K1 L1 L2 0.99999
D1 3 5 mod1
.model mod1 D (IS=1e-14, n=1)
.tran 0.2m 20m
.plot tran v(3), v(5)
.end
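In this netlist the coupled inductors L1 and L2 act as a step-down transformer. A rough hand estimate of the secondary voltage follows from the open-circuit ratio V2/V1 = K * sqrt(L2/L1). The Python sketch below applies it; it ignores the drop across rs, the loading by rl, and the diode drop, so it is only an upper estimate of what the simulation will show.

```python
import math


def open_circuit_ratio(k, l_primary, l_secondary):
    """Open-circuit voltage ratio V2/V1 of two coupled inductors."""
    return k * math.sqrt(l_secondary / l_primary)


ratio = open_circuit_ratio(0.99999, 2000.0, 20.0)  # K1, L1, L2 from the netlist
v_sec_peak = 170.0 * ratio                         # vin has a 170 V amplitude
print(f"ratio = {ratio:.5f}, secondary peak = {v_sec_peak:.2f} V (approx.)")
```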

41

Example 4: NPN Transistor Amplifier


*** Example of an NPN transistor
vin 1 0 ac 1 sin(0 10m 10k)
rs 1 2 1
c1 2 3 100uf
rb 5 3 465k
rc 5 4 3k
*Cl 4 out 100u
*RL out 0 10K
vcc 5 0 dc 10
q1 4 3 0 npn-trans
.model npn-trans npn (is=2e-15 bf=100 vaf=200)
.op
.ac dec 10 100 10G
.TRAN 0.1u 0.2m
.end
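Before running .op on this amplifier it helps to estimate the bias point by hand. The Python sketch below does so assuming a base-emitter drop of 0.7 V and ignoring the Early effect (vaf); both are simplifications, so the simulated values will differ slightly.

```python
VCC = 10.0          # vcc
RB = 465e3          # rb, from the supply to the base
RC = 3e3            # rc, from the supply to the collector
BF = 100.0          # forward current gain from the .model line
VBE_ASSUMED = 0.7   # assumed base-emitter voltage, not taken from the netlist

ib = (VCC - VBE_ASSUMED) / RB   # base current
ic = BF * ib                    # collector current in the active region
vc = VCC - ic * RC              # collector voltage, node 4

print(f"IB = {ib * 1e6:.1f} uA, IC = {ic * 1e3:.2f} mA, VC = {vc:.2f} V")
```

The estimate puts the collector near 4 V, comfortably between ground and the 10 V supply, so the transistor is biased in the active region.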
42
Example 5: CMOS Inverter
*CMOS inverter circuit
.Include 'model.txt'
Vdd Vdd 0 12
M1 out in vdd vdd CMOSP L=1.2U W=4U
M2 out in 0 0 CMOSN L=1.2U W=1.5U
Vin in 0 dc 1 pulse(0 12 0 1u 1u 100u 200u)
.DC vin 0 12 0.01
.TRAN 0.1u 1m
.probe
.END
43

Model file

.MODEL CMOSN NMOS LEVEL=3 PHI=0.600000 TOX=2.1500E-08 XJ=0.200000U
+TPG=1 VTO=0.8063 DELTA=9.4090E-01 LD=1.3540E-07 KP=1.0877E-04
+UO=680.4 THETA=8.3620E-02 RSH=109.3 GAMMA=0.5487 NSUB=2.3180E+16
+NFS=1.98E+12 VMAX=1.8700E+05 ETA=5.5740E-02
+KAPPA=5.9210E-02
+CGDO=3.2469E-10 CGSO=3.2469E-10
+CGBO=3.7124E-10 CJ=3.1786E-04
+MJ=1.0148 CJSW=1.3284E-10
+MJSW=0.119521 PB=0.800000

.MODEL CMOSP PMOS LEVEL=3 PHI=0.600000 TOX=2.1500E-08 XJ=0.200000U
+TPG=-1 VTO=-0.9403 DELTA=8.5790E-01 LD=1.1650E-09 KP=3.4276E-05
+UO=214.4 THETA=1.4010E-01 RSH=122.2 GAMMA=0.5615
+NSUB=2.4270E+16
+NFS=3.46E+12 VMAX=3.9310E+05 ETA=1.5670E-01
+KAPPA=9.9990E+00
+CGDO=2.7937E-12 CGSO=2.7937E-12
+CGBO=3.5981E-10 CJ=4.5952E-04
+MJ=0.4845 CJSW=2.7917E-10
+MJSW=0.365250 PB=0.850000

44
Problem 4 sheet 3:
Problem 4 Sheet 3
.subckt opamp 1 2 3
Rin 1 2 10G
E1 4 0 1 2 1E8
Ro 4 3 0.1
.ends
X1 1 2 3 opamp
X2 4 5 6 opamp
X3 8 7 9 opamp
V1 1 0 DC 10 SIN(0 1 1K)
V2 4 0 DC 5 SIN(0 1 2K)
R1 3 2 10K
R2 5 6 10K
Rgain 2 3 10K
R3 3 7 10K
R4 7 9 10K
R5 6 8 10K
R6 8 0 10K
Rload 9 0 10K
.DC V1 0 10 0.1
.END

45

Reference
PSpice and Circuit Analysis, 2nd Edition
By John Keown

46
Lecture-Set 2
Basic IC Manufacturing Processes

47

Contents
• Basic Fabrication Processes
– Ingot Growth
– Wafer Sawing and Smoothing
– Photolithography
• Application of Photoresist, Exposure, and Development
– Mask Generation
– Etching
– Oxidation
– Doping
• Diffusion
• Ion Implantation
– Deposition and Patterning
– Scribing and Cleaving

48
Ingot Growth
• The first step in the production of an integrated circuit is the growth of a large piece of almost perfectly crystalline semiconducting material called an ingot (boule).
• A small seed crystal is suspended in molten material, then pulled (1 m/hr) and rotated (1/2 rps) to form the ingot.
• The result is an ingot approx. 1 m long and anywhere from 75 to 300 mm in diameter.
• Dopant is almost always added to the molten material.
49

Ingot Growth

50
Wafer Sawing
• Ingots are then sawed into wafers approximately 500-1000 μm (0.5 to 1 mm) thick using a diamond-tipped saw.
• Wafers are the starting material for integrated circuit manufacture, and are normally referred to as the substrate.
• The surface of the wafer is smoothed with a combination of chemical and mechanical polishing steps.

https://www.youtube.com/watch?v=AMgQ1-HdElM
51

Photolithography
• Lithography refers to the transfer of an image onto paper using a plate and ink-soluble grease.
• Photolithography is the transfer of an image using photographic techniques.
• Photolithography transfers designer-generated information (device placement and interconnections) to an actual IC structure using masks which contain the geometrical information.
• The process of photolithography is repeated many times in the manufacture of an IC to build up device structures and interconnections.

52
Photolithography - Step 1: Application of Photoresist

• The first step in photolithography is to coat the surface with approx. 1 μm of photoresist (PR).
• PR will be the medium whereby the required image is transferred to the surface.
• PR is often applied to the center of the wafer, which is then spun to force the PR over the entire surface.
• Note that the scale of these diagrams is not correct: the PR is approx. 1 μm thick while the wafer is 1000 μm thick.

53

Photolithography - Exposure
• The PR is then exposed to UV (ultraviolet) radiation through a mask.
• The masks are generated from information about device placement and connection.
• The UV radiation causes a chemical change in the PR.
• The transfer of information from the mask to the surface occurs through the UV-induced chemical change, which only occurs where the mask is transparent.
54
Photolithography - Development
• The PR is then developed using a chemical developer.
• Two possibilities:
– A negative PR is hardened against the developer by the UV radiation, and hence remains on the surface where UV shone through the mask.
– A positive PR is the opposite: it is removed where the UV shone through the mask.
• Assume a negative PR for this example, so the unexposed PR on the sides is not hardened and will be removed by the developer.

55

Photolithography - Final Structure

• Once the developer has been washed off, the result is PR in the region corresponding to the transparent part of the mask (the mask is shown again to indicate where the final region is formed; it is not part of the final structure).
• Subsequent processing steps will use this structure to form device areas, interconnects, etc.
• Note that an optically reversed mask and a positive resist would give the same structure.
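The equivalence between a negative resist and a reversed mask with a positive resist can be checked with a toy model. The Python sketch below represents a mask as a row of regions that are either transparent or opaque; this one-dimensional representation and the function name are assumptions made only for illustration.

```python
def resist_after_development(mask_transparent, resist="negative"):
    """Return, region by region, whether photoresist remains after development.

    mask_transparent: list of booleans, True where the mask lets UV through.
    """
    if resist == "negative":
        # Exposed negative resist is hardened and stays on the surface.
        return [exposed for exposed in mask_transparent]
    if resist == "positive":
        # Exposed positive resist is removed by the developer.
        return [not exposed for exposed in mask_transparent]
    raise ValueError("resist must be 'negative' or 'positive'")


mask = [False, True, True, False]        # transparent in the middle
reversed_mask = [not m for m in mask]    # optically reversed mask

negative = resist_after_development(mask, "negative")
positive = resist_after_development(reversed_mask, "positive")
print(negative, positive, negative == positive)
```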

56
Etching - Dry and Wet Processes
• Etching is the selective removal of material from the chip surface.
• In dry etching, ions of a neutral material are accelerated toward the surface and cause ejection of atoms of all materials.
• In wet etching, a chemical etchant is used to remove material via a chemical reaction.
57

Etching - Selectivity and Anisotropy

• The two most important issues in etching are selectivity and anisotropy.
– Selectivity refers to the ability of an etchant to remove one material on the surface while leaving another intact.
– Isotropy refers to the tendency of the etching to proceed laterally as well as downward; an anisotropic etch proceeds mainly downward.

58
Thermal Oxidation - Oxidation Furnace
• One of the simplest steps in IC processing is thermal oxidation, the growth of a layer of silicon dioxide (SiO2) on the substrate surface.
• It requires only substrate heating to 900-1200 °C in a dry (O2) or wet (H2O steam) ambient using an oxidation furnace.
• Silicon oxidizes quite readily, which is one reason why Si is so widely used.

59

Thermal Oxidation - Oxide Formation

• Oxide forms due to the chemical reaction between oxygen in the ambient and silicon in the substrate.
• Substrate silicon is consumed during the reaction, so the oxide layer grows in both directions from the original substrate surface (approx. 50/50).
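As a quick piece of arithmetic on the point above, the Python sketch below splits a target oxide thickness into the part that grows below the original surface (silicon consumed) and the part that grows above it. The default fraction of 0.5 is the slide's approximate 50/50 figure; textbooks commonly quote a value somewhat below one half (around 0.45), so the fraction is left as a parameter.

```python
def oxide_interfaces(oxide_thickness_um, consumed_fraction=0.5):
    """Split an oxide thickness into (below, above) the original Si surface."""
    below = consumed_fraction * oxide_thickness_um  # silicon consumed
    above = oxide_thickness_um - below              # growth above the surface
    return below, above


below, above = oxide_interfaces(1.0)
print(f"1 um of oxide: {below:.2f} um below and {above:.2f} um above the original surface")
```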

60
Thermal Oxidation - Wet vs. Dry Rates
• Due to the different reaction mechanisms, oxidation in a wet ambient is many times faster than oxidation in a dry ambient.
• However, the oxide quality is much better when a dry ambient is used.
• Thick isolation layers are therefore formed using wet oxidation, while MOSFET gate oxides are formed with dry oxidation.

61

Local Oxidation
• The presence of another material such as silicon nitride (Si3N4) on the surface inhibits the growth of oxide in that region.
• This allows selective or local oxidation of the substrate surface, which will be used to isolate devices or conductive layers.
• Some oxidation does occur laterally under the nitride layer, giving rise to the bird's beak effect.

62
Doping: 1 - By Diffusion
• Dopant can be introduced into the substrate through diffusion.
• Diffusion is a general physical process which drives particles down a concentration gradient.
• The substrate is heated in the presence of dopant atoms, which then diffuse into the substrate.
• Diffusion may also occur into other layers which are present, such as silicon dioxide.
• A large amount of lateral diffusion also occurs.
63

Doping: 2 - By Ion Implantation

• In ion implantation, dopant atoms are accelerated toward the substrate surface and enter due to their kinetic energy.
• This is the preferred technique for introducing dopant atoms, since the amount of lateral diffusion is much lower.
64
Ion Implantation - Predep and Drive-in
• Ion implantation can be used to form a deep region of doping using a two-step procedure:
– A high concentration of dopant is deposited near the surface in the pre-deposition or predep stage.
– The dopant source is then removed and the wafer heated to cause redistribution of the dopant via diffusion in the drive-in stage.

65

Deposition
• Layers of materials such as metal (and in some cases silicon dioxide) may need to be formed on the surface.
• The general procedure of forming a layer of material on the surface is termed deposition.
• Two types can be identified, physical and chemical:
– In physical deposition, a piece (target) of the material to be deposited is bombarded with ions, ejecting atoms of material which then adhere to the substrate surface.
– Chemical deposition uses an ongoing chemical reaction to form the desired material as a precipitate on the substrate surface.
• A specialized form of deposition is epitaxy, the formation of a layer of crystalline semiconductor material.
66
Patterning
• The use of a series of PR deposition, exposure, development, and etching steps to create regions of a particular shape is called patterning.
• For example, if a newly deposited metal layer was coated with PR, exposed using a mask, developed, and etched using a method which selectively removed the metal not covered by the PR, this would be referred to as "patterning" the metal.
• There will be many individual patterning steps in the creation of a useful integrated structure.

67

Mask Generation - Reticle


• The geometry information over the entire IC required for a particular photolithography step is used to create a reticle (a mask-like grid), a 10X sized optical plate.
• There can be anywhere from 6 to 24+ individual photolithography steps in a manufacturing process, each with its own set of geometrical information captured in a reticle.

68
Mask Generation - Step and Repeat
• In order to fabricate many devices simultaneously, the reticle information is reduced and projected many times onto a 1X mask using a step and repeat process.

69

Mask Generation - Final 1X Mask

• The 1X mask which results from the step and repeat process contains all the information for a particular photolithography step for all chips which will be fabricated on the wafer.
• This image is projected during the exposure step to cause PR chemical changes in the appropriate locations.
70
Example Simple Mask Set
• Shown below is a highly simplified layout for a two-transistor digital gate, and the masks which would be required based on its layout (see MOSFET).
• Not in notes, just shown as an example of how masks are derived from a user-generated layout.

71

Final Fabricated Wafer

72
Scribing and Cleaving
• After processing is finished, the wafers are separated into individual dice by scribing and cleaving.
– Scribing refers to creating a groove along scribe channels which have been left between the rows and columns of individual chips (during mask generation).
– Cleaving is the process of breaking the wafer apart into individual dice.

73

Packaging

74
IC Packages

75

Lecture-Set 4
CMOS Fabrication

76
CMOS Fabrication
Fabrication processes
• ICs are built on a silicon substrate:
– some structures are diffused into the substrate;
– other structures are built on top of the substrate.
• Substrate regions are doped with n-type and p-type impurities (n+ = heavily doped).
• Wires are made of polycrystalline silicon (poly) and multiple layers of aluminum (metal).
• Silicon dioxide (SiO2) is the insulator.

77

Inverter Mask Set


• Transistors and wires are defined by masks
• Cross‐section taken along dashed line

78
Detailed Mask Views
• Six masks
– n-well
– Polysilicon
– n+ diffusion
– p+ diffusion
– Contact
– Metal

79

Fabrication
• Chips are built in huge factories called fabs.
• Fabs contain clean rooms as large as football fields.

Courtesy of International
Business Machines Corporation.
Unauthorized use not permitted.

80
Fabrication Steps
• Start with a blank wafer
• Build the inverter from the bottom up
• The first step will be to form the n-well
– Cover the wafer with a protective layer of SiO2 (oxide)
– Remove the layer where the n-well should be built
– Implant or diffuse n dopants into the exposed wafer
– Strip off the SiO2

p substrate

81

Oxidation
• Grow SiO2 on top of the Si wafer
– 900-1200 °C with H2O or O2 in an oxidation furnace

SiO2

p substrate

82
Photoresist
• Spin on photoresist
– Photoresist is a light-sensitive organic polymer
– Softens where exposed to light

Photoresist
SiO2

p substrate

83

Lithography
• Expose photoresist through the n-well mask
• Strip off exposed photoresist

Photoresist
SiO2

p substrate

84
Etch
• Etch oxide with hydrofluoric acid (HF)
– Seeps through skin and eats bone; nasty stuff!!!
• Only attacks oxide where resist has been exposed

Photoresist
SiO2

p substrate

85

Strip Photoresist
• Strip off remaining photoresist
– Use a mixture of acids called piranha etch
• Necessary so resist doesn't melt in the next step

SiO2

p substrate

86
n-well
• The n-well is formed with diffusion or ion implantation
• Diffusion
– Place wafer in furnace with arsenic gas
– Heat until As atoms diffuse into exposed Si
• Ion Implantation
– Blast wafer with beam of As ions
– Ions blocked by SiO2, only enter exposed Si
SiO2

n well

87

Strip Oxide
• Strip off the remaining oxide using HF
• Back to bare wafer with n-well
• Subsequent steps involve a similar series of steps

n well
p substrate

88
Polysilicon
• Deposit a very thin layer of gate oxide
– < 20 Å (6-7 atomic layers)
• Chemical Vapor Deposition (CVD) of silicon layer
– Place wafer in furnace with silane gas (SiH4)
– Forms many small crystals called polysilicon
– Heavily doped to be a good conductor

89

Polysilicon Patterning
• Use same lithography process to pattern 
polysilicon

Polysilicon

90
Self-Aligned Process
• Use oxide and masking to expose where n+ dopants should be diffused or implanted
• N-diffusion forms the nMOS source, drain, and n-well contact

91

N-diffusion
• Pattern oxide and form n+ regions
• Self-aligned process where the gate blocks diffusion
• Polysilicon is better than metal for self-aligned gates because it doesn't melt during later processing

n well
p substrate

92
N-diffusion cont.
• Historically dopants were diffused
• Usually ion implantation today
• But regions are still called diffusion

93

N-diffusion cont.
• Strip off oxide to complete patterning step

94
P-Diffusion
• A similar set of steps forms p+ diffusion regions for the pMOS source and drain and the substrate contact

p+ Diffusion

95

Contacts
• Now we need to wire together the devices
• Cover chip with thick field oxide
• Etch oxide where contact cuts are needed

Contact

Thick field oxide


p+ n+ n+ p+ p+ n+
n well
p substrate

96
Metalization
• Sputter on aluminum over the whole wafer
• Pattern to remove excess metal, leaving wires

Metal

97

Lecture-Set 5
Layout and Design Rules

98
Layout

• Chips are specified with a set of masks
• Minimum dimensions of masks determine transistor size (and hence speed, cost, and power)
• Feature size f = distance between source and drain
– Set by minimum width of polysilicon

99

Layout
• Feature size improves 30% every 3 years or so
• Normalize for feature size when describing design rules
• Express rules in terms of λ = f/2
– E.g. λ = 0.3 μm in 0.6 μm process
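The normalization λ = f/2 can be sketched in a few lines of Python. This is only an illustration: the rule names and their λ values in `example_rules` are hypothetical placeholders, not actual MOSIS numbers.

```python
def lambda_um(feature_size_um: float) -> float:
    """Return lambda in microns, using lambda = f / 2."""
    return feature_size_um / 2.0


def rules_in_microns(rules_lambda: dict, feature_size_um: float) -> dict:
    """Scale every rule given in lambda units to microns for one process."""
    lam = lambda_um(feature_size_um)
    return {name: value * lam for name, value in rules_lambda.items()}


# Hypothetical rule values in lambda, for illustration only.
example_rules = {"poly_width": 2, "metal1_width": 3, "metal1_spacing": 3}

if __name__ == "__main__":
    print(lambda_um(0.6))                        # lambda for a 0.6 um process
    print(rules_in_microns(example_rules, 0.6))  # same rules expressed in microns
```

The same rule table can then be reused for another process by changing only the feature size.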

100
Layout CAD Tools Example 1 : Ledit

101

CMOS Inverter Layout


Example 2: Electric Layout Editor

103

Example 3: Cadence Virtuoso

104
Design Rules
• The main objective of the layout rules is to build reliably functional circuits in as small an area as possible. Or, to ensure that the design works even when small fab errors (within some tolerance) occur.
• Interface between the circuit designer and process engineer
• The rule file contains:
– Unit dimension: minimum line width, spacing
• scalable design rules: lambda parameter
• absolute dimensions: micron rules
– A complete set includes
• Set of layers
• intra‐layer:  relations between objects in the same layer
• inter‐layer:  relations between objects on different layers
105

Well Rules
• The n‐well is usually a deeper implant (especially a deep n‐well) than the transistor source/drain implants. Hence, it is necessary to provide sufficient clearance between the n‐well edges and the adjacent n+ diffusions.
• The masks encountered for well specification may include n‐well, p‐well, and deep n‐well.
106
Transistor Rules
• CMOS transistors are generally defined by at least four physical masks.
– active (also called diffusion, diff, thinox, OD, or RX)
• defines all areas where either n‐ or p‐type diffusion is to be placed.
– n‐select (also called n‐implant, nimp, or nplus)
• n‐select surrounds active regions where n‐type diffusion is required.
– p‐select (also called p‐implant, pimp, or pplus)
• p‐select surrounds areas where p‐type diffusion is required.

107

Transistor Rules …. Cont.


– Polysilicon (also called poly, polyg, PO, or PC).
– The gates of transistors are defined by the logical AND of the polysilicon mask and the active mask.
– p‐diffusion areas inside n‐wells define pMOS transistors (or p‐diffusion wires).
– p‐diffusion areas inside p‐wells define substrate contacts (or p‐well contact).
• Sometimes, design systems will define only n‐diffusion (ndiff) and p‐diffusion (pdiff) to reduce the complexity of the process. ndiff will be converted automatically into active with an overlapping rectangle or polygon of n‐select.
108
Transistor Rules …. Cont.
• It is essential for the poly to cross active 
completely; otherwise the transistor that has 
been created will be shorted by a diffusion 
path between source and drain.

Transistors

Catastrophic error

109

Contact Rules
• Contacts are normally of uniform size to allow for consistent etching of very small features.
• Poly Contact:
– To connect the poly to the Metal 1 layer.
• Active Contact:
– To connect Metal 1 to active area (p‐diff, n‐diff).

110
Metal and Via Rules
• Metal spacing may vary with the width of the metal line (above some metal wire width, the minimum spacing may be increased).
• There may also be maximum metal width rules.
• Each technology contains more than one metal layer.
• A via connects two successive metal layers (Via 1 connects M1 to M2, Via 2 connects M2 to M3, etc.).
• Via rules are similar to contact rules
111

MOSIS Scalable CMOS Design Rules


• Lambda‐based scalable CMOS design rules from MOSIS.
• Dimension and spacing are given in terms of Lambda.
• Lambda = feature size/2; Lambda = 0.3 um for the 0.6 um technology node.

112
Intra-Layer Design Rules

[Figure: intra‐layer design rules (minimum widths and spacings in λ) for Well, Polysilicon, Active, Metal1, Metal2, Contact or Via Hole, and Select, with separate spacings for shapes at the same potential and at different potentials]

113

Transistor Layout
[Figure: transistor layout with minimum dimensions in λ]

114
Design Rules
Example: MOSIS SCMOS design rules
• Designed to scale across a wide range of technologies.
• Designed to support multiple vendors.
• Designed for educational use.

115

SCMOS Layout Rules - Well

116
SCMOS Layout Rules - Active

Full Rules
117

Micron Design Rules


Micron design rules for 65 nm process

118
Micron Design Rules ..Cont.

119

NMOS Layout : (mhp_n05 Tech.)


• NMOSFET
– N‐Select
– Active
– Poly
– P‐Select
– Active
– Poly Contact
– Active Contact
– Metal 1 (M1)

120
MOSFET Layout: Example 2
• PMOSFET
– N‐Well
– P‐Select
– Active
– Poly
– N‐Select
– Active
– Poly Contact
– Active Contact
– Metal 1 (M1)

121

Bipolar Transistors

122
Circuit Passive Elements : Resistor
• Resistor:
– A resistor is a piece of material (Ndiff, Pdiff, N‐Well, etc.) with a specific resistivity per square (sheet resistance) in Ohm/sq.
– The sheet resistance can range between 100 and 1000 Ohm/sq.
– The resistance of a square, of any area, is constant: R = Rsq x (L/W), and L = W for a square.

123

Circuit Passive Elements: Resistor


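Resistivity of poly is 4 Ohms/sq
Straight resistor, 18 um long and 3 um wide: R = (18/3) x 4 = 24 Ohm
Bent resistor, 3 um wide with 15 um outer arm lengths (two 12 um straight sections plus one corner square counted as 0.5 square): R = (12/3 + 0.5 + 12/3) x 4 = 34 Ohm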
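The square-counting arithmetic above can be sketched as a small helper. The function name and the default of 0.5 square per corner are assumptions taken from the worked numbers on this slide, not a general extraction rule.

```python
def resistance_ohm(sheet_ohm_per_sq, straight_lengths_um, width_um,
                   corners=0, squares_per_corner=0.5):
    """Resistance = sheet resistance x number of squares.

    Each straight section contributes length / width squares; each
    90-degree corner is counted as `squares_per_corner` squares.
    """
    squares = sum(length / width_um for length in straight_lengths_um)
    squares += corners * squares_per_corner
    return sheet_ohm_per_sq * squares


if __name__ == "__main__":
    print(resistance_ohm(4, [18], 3))                 # straight poly resistor
    print(resistance_ohm(4, [12, 12], 3, corners=1))  # bent poly resistor
```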

124
Circuit Passive Elements: Resistor
• Resistor Layout
– Use dummy fingers to guarantee the integrity of the layout (they decrease the effect of graded side diffusion).

125

Circuit Passive Elements


• Capacitors:
– Available in technologies that provide two poly layers.
– PIP Cap: in a poly‐insulator‐poly (PIP) capacitor, a thin oxide is placed between the two polysilicon layers to achieve a capacitance of approximately 1 fF/um².

126
Capacitor Layout
• Example: It is required to build a 180 fF capacitor using the PIP technique, given that for the adopted process the capacitance per unit area is 800 aF/µm² (1 attofarad is 10^-18 F).
• Solution:
• Calculate the area of the capacitor (i.e. the area of overlap of poly and electrode)
• Area = 180 fF / (800 aF/µm²) = 225 square microns.

127

Capacitor Layout.. Cont
• Assume that the capacitor will be built as a perfect square; the length and the width would be 15 microns each, and the terminals are on the M1 layer
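A minimal sketch of the area calculation, assuming a square plate and ignoring fringing. The function name is illustrative; the 500 aF/µm² value is the M1‐M2 unit capacitance quoted for the MIM exercise.

```python
import math


def square_capacitor(cap_fF, cap_per_area_aF_per_um2):
    """Return (area in um^2, side in um) of a square parallel-plate capacitor."""
    area_um2 = cap_fF * 1000.0 / cap_per_area_aF_per_um2  # 1 fF = 1000 aF
    return area_um2, math.sqrt(area_um2)


if __name__ == "__main__":
    print(square_capacitor(180, 800))  # PIP example: 800 aF/um^2
    print(square_capacitor(180, 500))  # MIM variant: 500 aF/um^2
```

With the lower MIM unit capacitance the same 180 fF needs a larger plate, a side of roughly 19 µm instead of 15 µm.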

129

Circuit Passive Elements


• Capacitors:
– MIM Cap (Similar to PIP Cap)
• Repeat the last example given that the M1‐M2 unit capacitance is 500 aF/µm² and the terminals are connected to M3
– Fringe capacitor

Fringe capacitor

130
Circuit Passive Elements
• Inductor:

Typical spiral inductor


and equivalent circuit

131

Full Chip Layout Cadence Virtuoso Layout Editor

132
Die Micrograph

133

Testing process

134
Testing process (Cont.)

135

Lecture-Set 5
MOS Transistor (Review)

136
Pros of MOSFETs
• Size (smaller)
• Ease of manufacture
• Lower power consumption
• MOSFET technology
– It allows placement of approximately 2 billion 
transistors on a single IC.
• backbone of very large scale integration (VLSI)
– It is considered preferable over BJT technology for 
many applications.

137

Device Structure and Operation

Physical structure of the enhancement-type NMOS transistor: (a) perspective view, (b) cross-section. Note that typically L = 0.03 um to 1 um, W = 0.1 um to 100 um, and the thickness of the oxide layer (tox) is in the range of 1 to 10 nm.
138
Operation with Zero Gate Voltage
• With zero voltage applied to gate, two back‐to‐back diodes exist in series between drain and source.
• They prevent current conduction from drain to source when a voltage vDS is applied,
– yielding very high resistance (about 10^12 ohms)

139

Creating a Channel Between Source and Drain

• Q: What happens if source, drain and body are grounded while a positive voltage is applied to the gate?
• A:
– step #1: vGS is applied to the gate terminal, causing the gate terminal to be positively charged.
– step #2: This positive charge causes free holes to be repelled from the region of the p‐type substrate under the gate. This migration of holes results in the uncovering of negative bound charges, originally neutralized by the free holes. This creates a depletion region, indicated by white color.

140
Creating a Channel Between Source and Drain

• step #3: Further increasing the positive gate voltage attracts electrons, inverting the layer under the oxide (p‐type becomes n‐type) and forming an n‐type channel between the source and the drain. The channel is indicated by the gray color.

141

Creating a Channel Between Source and Drain

• The amount of vGS required to just form the channel is called Vt (threshold voltage).
• Vt is a characteristic quantity (mainly) determined during the fabrication of the device. Its value is around 0.3 to 1.0 V.
• The excess of vGS over Vt is termed the effective voltage or the overdrive voltage and is the quantity that determines the charge in the channel.
• When vDS = 0, the voltage at every point along the channel is zero, and the effective voltage across the oxide is vOV = vGS - Vt.

142
Creating a Channel Between Source and Drain

• The gate and the channel region of the MOSFET form a parallel-plate capacitor, with the oxide layer acting as the capacitor dielectric.
• The amount of the channel charge is determined by vOV: |Q| = Cox W L vOV
• Cox is called the oxide capacitance (gate capacitance) per unit gate area; W and L are the width and the length of the channel.

143

Creating a Channel Between Source and Drain

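• Cox is given by: Cox = εox / tox, where εox = 3.9 ε0 = 3.45 x 10^-11 F/m is the permittivity of silicon dioxide.
• tox (the oxide thickness) is determined by the process technology used to fabricate the MOSFET.
• As an example, for a process with tox = 4 nm, Cox = 3.45 x 10^-11 / 4 x 10^-9 = 8.6 x 10^-3 F/m² = 8.6 fF/um².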

144
Applying a Small vDS
• For small vDS, a current will flow from drain to source.
• The MOSFET acts as a controlled resistor. The resistance can be given by: rDS = 1 / (μ Cox (W/L)(vGS - Vt))

145

Derivation of the current equations


iDS = Q (per unit length) x charge velocity = (Q/L) v
v = μE, E = vDS/L, Q/L = Cox W vOV
iDS = Cox W vOV μ vDS/L = μ Cox (W/L) vOV vDS
= μ Cox (W/L)(vGS - Vt) vDS
k'n = μn Cox is the process transconductance parameter, and kn = k'n (W/L) is the MOSFET transconductance parameter.

146
Derivation of current equations .. Cont.
gDS = iDS/vDS = μ Cox (W/L)(vGS - Vt)
rDS = 1/gDS
The resistance for vGS = Vt is infinite and becomes smaller as the gate-to-source voltage increases.
147

Operation as vDS is Increased

148
I-V Relations for vDS < VOV

149

Operation for vDS > VOV

The zero depth of the channel at the drain end gives rise to the term channel pinch-off, at vDS = VOV or, equivalently, vGD = Vt.
Increasing vDS beyond this value has no effect on the channel shape and charge, and hence on iDS.

150
Conclusion of Regions of Operation
• vGS < Vt → iD = 0, device is in the cut-off region
• vGS > Vt, vDS < vGS - Vt → device is in the triode region and iD = k'n (W/L) [(vGS - Vt) vDS - vDS²/2]
• vGS > Vt, vDS >= vGS - Vt → device is in the saturation region and iD = (1/2) k'n (W/L) (vGS - Vt)²
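The three regions can be sketched as one function under the square-law model, assuming an NMOS device with vDS >= 0 and ignoring channel-length modulation. The function name and the numeric values in the usage lines are illustrative only.

```python
def drain_current(vgs, vds, vt, k_prime, w_over_l):
    """Return (region, iD) for an NMOS transistor, square-law model."""
    vov = vgs - vt                      # overdrive voltage
    if vov <= 0:
        return "cutoff", 0.0
    if vds < vov:                       # channel continuous from source to drain
        return "triode", k_prime * w_over_l * (vov * vds - 0.5 * vds ** 2)
    # channel pinched off at the drain end
    return "saturation", 0.5 * k_prime * w_over_l * vov ** 2


if __name__ == "__main__":
    # Illustrative values: k'n = 194 uA/V^2, W/L = 10, Vt = 0.7 V
    for vgs, vds in [(0.5, 1.0), (1.0, 0.1), (1.0, 2.0)]:
        print(vgs, vds, drain_current(vgs, vds, 0.7, 194e-6, 10))
```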

151

Example
• Consider a process technology for which Lmin = 0.4 μm, tox = 8 nm, μn = 450 cm²/V·s, and Vt = 0.7 V.
(a) Find Cox and k'n.
(b) For a MOSFET with W/L = 8 μm/0.8 μm, calculate the values of VOV, VGS, and VDSmin needed to operate the transistor in the saturation region with a dc current ID = 100 μA.
(c) For the device in (b), find the values of VOV and VGS required to cause the device to operate as a 1000 Ohm resistor for very small vDS.
152
Solution
(a)

(b) For operation in the saturation region,

153

Solution Cont….
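A minimal sketch of the example's arithmetic under the square-law model, assuming εox = 3.45 x 10^-11 F/m for silicon dioxide. The function and key names are illustrative.

```python
import math

EPS_OX = 3.45e-11  # F/m, permittivity of SiO2 (3.9 * eps0), assumed value


def solve_example(tox_m=8e-9, mu_n_cm2_per_vs=450.0, vt=0.7,
                  w_over_l=8.0 / 0.8, i_d=100e-6, r_ds=1000.0):
    cox = EPS_OX / tox_m                        # F/m^2 (1e-3 F/m^2 = 1 fF/um^2)
    k_prime = mu_n_cm2_per_vs * 1e-4 * cox      # A/V^2, mobility converted to m^2/Vs
    # (b) saturation: ID = 0.5 * k'n * (W/L) * VOV^2
    vov_sat = math.sqrt(2.0 * i_d / (k_prime * w_over_l))
    # (c) small vDS: rDS = 1 / (k'n * (W/L) * VOV)
    vov_res = 1.0 / (k_prime * w_over_l * r_ds)
    return {
        "Cox_fF_per_um2": cox * 1e3,
        "kn_prime_uA_per_V2": k_prime * 1e6,
        "VOV_sat": vov_sat,
        "VGS_sat": vt + vov_sat,
        "VDS_min": vov_sat,
        "VOV_res": vov_res,
        "VGS_res": vt + vov_res,
    }


if __name__ == "__main__":
    for name, value in solve_example().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the results come out near Cox = 4.3 fF/μm², k'n = 194 μA/V², VOV = 0.32 V and VGS = 1.02 V for part (b), and VOV = 0.52 V and VGS = 1.22 V for part (c).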

154
p-Channel MOSFET
• All semiconductor regions are reversed in 
polarity relative to their counterparts in the 
NMOS case.
• The substrate is n type and the source and the drain regions are p+ type.

155

p-Channel MOSFET
• To induce a channel for current flow between source and drain, a negative gate voltage vGS < Vtp is applied. Vtp is a negative value.
• To avoid dealing with the negative sign, we can write: |vGS| > |Vtp|
• The process transconductance parameter for the PMOS device is: k′p = μpCox
• The transistor transconductance parameter kp is: kp = k′p(W⁄L)

156
Complementary MOS or CMOS
• Composed of N and P MOS transistors. CMOS is the most widely used of all the IC technologies

157

N and P MOSFET Circuit Symbol

NMOS

(d)

PMOS

(d)

158
The iD–vDS Characteristics

159

PSPICE Simulation

160
The iD– vGS Characteristic
In saturation the drain current is constant, determined by vGS - Vtn (vOV)

161

PSPICE Simulation

162
Example

163

Solution Cont..

164
Solution Cont..

165

Solution Cont..

166
Finite Output Resistance in Saturation
• Increasing vDS beyond vDSsat causes the channel pinch‐off point to move slightly away from the drain, thus reducing the effective channel length (by ΔL), a phenomenon known as channel‐length modulation.

167

Characteristics of the p-Channel MOSFET

168
MOS Capacitance
• Any two conductors separated by an insulator have capacitance
• Gate to channel capacitor is very important
– Creates channel charge necessary for operation
• Source and drain have capacitance to body
– Across reverse‐biased diodes
– Called diffusion capacitance because it is associated with source/drain diffusion

169

Dynamic Behavior of MOS Transistor

CGS CGD

S D

CSB CGB CDB

170
Gate Capacitance
• Approximate channel as connected to source
• Cgs = εoxWL/tox = CoxWL = CpermicronW
• Cpermicron is typically about 2 fF/μm
[Figure: top view and cross section of the transistor showing the polysilicon gate, n+ source and drain, gate oxide thickness tox, channel length L, width W, lateral diffusion xd, drawn length Ld, and the gate-bulk overlap]
171

Gate Capacitance

[Figure: gate-to-channel capacitance CGC between the gate G and the S/D terminals in the cut-off, resistive, and saturation regions]

Most important regions in digital design: saturation and cut-off

172
Depletion (Junction) Capacitance

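[Figure: source junction geometry showing the bottom and side-wall junctions, the channel-stop implant, source doping ND, substrate doping NA, junction depth xj, width W, source length LS, and the channel]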

173

Junction Capacitance

174
Junction (Dep) Capacitance
• Csb, Cdb
• Undesirable, called parasitic capacitance
• Capacitance depends on area and perimeter
– Use small diffusion nodes
– Comparable to Cg for contacted diff
– ½ Cg for uncontacted
– Varies with process

175

Body Effect

• Body is a fourth transistor terminal
• Vsb affects the charge required to invert the channel
– Increasing Vs or decreasing Vb increases Vt

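$V_t = V_{t0} + \gamma\left(\sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s}\right)$
• φs = surface potential at threshold
$\phi_s = 2 v_T \ln\frac{N_A}{n_i}$
– Depends on doping level NA
– And intrinsic carrier concentration ni
• γ = body effect coefficient
$\gamma = \frac{t_{ox}}{\varepsilon_{ox}}\sqrt{2 q \varepsilon_{si} N_A} = \frac{\sqrt{2 q \varepsilon_{si} N_A}}{C_{ox}}$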

176
Leakage
• What about current in cutoff?
• Simulated results
• What differs?
– Current doesn’t go to 0 in cutoff

177

Leakage Sources
• Subthreshold conduction
– Transistors can’t abruptly turn ON or OFF
– Dominant source in contemporary transistors
• Gate leakage
– Tunneling through ultrathin gate dielectric
• Junction leakage
– Reverse‐biased PN junction diode current

178
Lecture-Set 7
CMOS Logic Gates

179

CMOS Circuit Styles


• Static complementary CMOS ‐ except during switching, output connected to either VDD or GND via a low‐resistance path
– high noise margins
• full rail to rail swing
• VOH and VOL are at VDD and GND, respectively
– low output impedance, high input impedance
– no steady state path between VDD and GND (no static power consumption)
– delay is a function of load capacitance and transistor resistance
– comparable rise and fall times (under the appropriate transistor sizing conditions)
• Dynamic CMOS ‐ relies on temporary storage of signal values on the capacitance of high‐impedance circuit nodes
– simpler, faster gates
– increased sensitivity to noise
Transistors as Switches
• We can view MOS transistors as electrically controlled switches
• Voltage at gate controls path from source to drain
– nMOS: g = 0 → OFF, g = 1 → ON
– pMOS: g = 0 → ON, g = 1 → OFF

181

Static Complementary CMOS


• Pull-up network (PUN) and pull-down network (PDN)
– PUN: PMOS transistors only, between VDD and the output F
• pull-up: make a connection from VDD to F when F(In1,In2,…InN) = 1
– PDN: NMOS transistors only, between the output F and GND
• pull-down: make a connection from F to GND when F(In1,In2,…InN) = 0
• PUN and PDN are dual logic networks

182
Dual PUN and PDN
• PUN and PDN are dual networks
– a parallel connection of transistors in the PUN 
corresponds to a series connection of the PDN
• Complementary gate is naturally inverting (NAND, NOR, AOI, OAI)
• Number of transistors for an N‐input logic gate is 2N
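The duality between the two networks can be checked mechanically. In this sketch a network is a nested tuple such as `("series", "A", "B")`; this representation and all function names are assumptions made for illustration. nMOS switches are taken to conduct when their input is 1 and pMOS switches when it is 0.

```python
from itertools import product


def dual(net):
    """Swap series and parallel connections; input names stay unchanged."""
    if isinstance(net, str):
        return net
    kind, *children = net
    swapped = "parallel" if kind == "series" else "series"
    return (swapped, *(dual(child) for child in children))


def leaves(net):
    """Set of input names used in the network."""
    if isinstance(net, str):
        return {net}
    return set().union(*(leaves(child) for child in net[1:]))


def conducts(net, inputs, on_when):
    """True if the switch network conducts for the given input values."""
    if isinstance(net, str):
        return inputs[net] == on_when
    kind, *children = net
    states = [conducts(child, inputs, on_when) for child in children]
    return all(states) if kind == "series" else any(states)


def is_complementary(pdn):
    """Exactly one of PDN and its dual PUN conducts for every input pattern."""
    pun = dual(pdn)
    names = sorted(leaves(pdn))
    for values in product([0, 1], repeat=len(names)):
        inputs = dict(zip(names, values))
        if conducts(pdn, inputs, 1) == conducts(pun, inputs, 0):
            return False
    return True


if __name__ == "__main__":
    nand2_pdn = ("series", "A", "B")
    print(dual(nand2_pdn), is_complementary(nand2_pdn))
```

Because exactly one network conducts for every input pattern, the output is always driven and there is never a steady-state path from VDD to GND.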

184
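Threshold Drops
• PUN (charging CL from 0):
– PMOS pull-up: output goes 0 → VDD
– NMOS pull-up: output goes 0 → VDD - VTn
• PDN (discharging CL from VDD):
– NMOS pull-down: output goes VDD → 0
– PMOS pull-down: output goes VDD → |VTp|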

185

Construction of PDN
• NMOS devices in series implement a NAND function (the path conducts when A•B)
• NMOS devices in parallel implement a NOR function (the path conducts when A+B)

186
CMOS Inverter

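A Y
0 1
1 0
[Figure: inverter schematic with input A and output Y; for A = 0 the pMOS is ON and the nMOS is OFF, for A = 1 the pMOS is OFF and the nMOS is ON]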
187

CMOS NAND Gate

A B Y
0 0 1
0 1 1
1 0 1
1 1 0
[Figure: 2-input NAND schematic showing the ON/OFF state of each transistor for every input combination]
188
CMOS NOR Gate

A B Y
0 0 1
0 1 0
1 0 0
1 1 0
[Figure: 2-input NOR schematic with inputs A, B and output Y]

189

3-input NAND Gate


• Y pulls low if ALL inputs are 1
• Y pulls high if ANY input is 0

Y
A
B
C

190
Compound Gates
AND-OR-INVERT-22, or AOI22

PDN Combine both Parts

PUN

191

Example 2
• Sketch a transistor‐level schematic for a compound CMOS logic gate of the following function.

192
Solution of Example 1 in Class

193

Lecture 9
Synthesis of Complex CMOS Gate

194
Example 1
• Construct the schematic diagram of the following logic function using static CMOS design topology.

Y = D + A.(B + C)

195

Synthesis of Complex CMOS Gate

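OUT = !(D + A • (B + C))
[Figure: transistor-level schematic; PUN: D in series with the parallel combination of A and the series pair B, C; PDN: D in parallel with A in series with the parallel pair B, C]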

196
Standard Cell Layout Methodology

Routing
channel VDD

signals

GND

197

Standard Cell Layout Methodology

Routing
channel VDD

signals

GND

What logic function is this?

198
Standard Cell Layout Methodology
• Logic Graph
– A graph that represents the connection of the PUN (PMOS) transistors from VDD to OUTPUT and of its dual PDN (NMOS) transistors between OUTPUT and GND.
– Example on 2-input NAND and NOR gates

199

Area Efficient Layout


• An uninterrupted diffusion strip is possible only if there exists an Euler path in the logic graph
– Euler path: a path through all nodes in the graph such that each edge is visited once and only once.
• For a single poly strip for every input signal, the Euler paths in the PUN and PDN must be consistent.
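A brute-force sketch of the consistency check: each network is a list of edges `(input, node1, node2)`, and an input ordering is consistent if it traces an Euler path in both the PUN and the PDN. The edge lists and the internal node names (`i`, `j`) are one reading of the OAI21 gate X = !(C • (A + B)), and each input is assumed to drive exactly one transistor per network.

```python
from itertools import permutations


def is_trail(order, edges):
    """True if taking the edges in this order forms one continuous walk."""
    ends = {label: (u, v) for label, u, v in edges}
    for start in ends[order[0]]:          # try both ends of the first edge
        position = start
        for label in order:
            u, v = ends[label]
            if position == u:
                position = v
            elif position == v:
                position = u
            else:
                break                     # edge does not touch current node
        else:
            return True
    return False


def consistent_orders(pun, pdn):
    """All input orderings that are Euler paths in both networks."""
    labels = sorted(label for label, _, _ in pdn)
    return [order for order in permutations(labels)
            if is_trail(order, pun) and is_trail(order, pdn)]


# OAI21: PDN has C in series with (A parallel B); PUN is the dual.
OAI21_PDN = [("C", "X", "i"), ("A", "i", "GND"), ("B", "i", "GND")]
OAI21_PUN = [("C", "VDD", "X"), ("A", "VDD", "j"), ("B", "j", "X")]

if __name__ == "__main__":
    print(consistent_orders(OAI21_PUN, OAI21_PDN))
```

Any ordering returned gives a gate (poly) sequence that allows one unbroken diffusion strip in each network. Trying every permutation is only practical for the small gates considered here.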

200
Stick Diagram
• Represents relative positions of transistors
• Contains no dimensions
• Used as a guide line for Layout
• Constructed based on Euler path

201

Example on 2-inputs NAND and NOR gates

202
Example 2: OAI21 Logic Graph

X = !(C • (A + B))
[Figure: OAI21 schematic and logic graph; PUN edges A, B, C between VDD, internal node j and X; PDN edges A, B, C between X, internal node i and GND]

203

Two Stick Layouts of !(C • (A + B))

A B C

VDD

GND

uninterrupted diffusion strip

204
Example 2: OAI22 Logic Graph

X = !((A+B)•(C+D))
[Figure: OAI22 schematic and logic graph; PUN edges A, B, C, D between VDD and X; PDN edges A, B, C, D between X and GND]

205

OAI22 Layout
A B D C

VDD

GND

• Some functions have no consistent Euler path, like x = !(a + bc + de) (but x = !(bc + a + de) does!)

206
Example 3:

For the following Function:

Y = AB + E + CD

• Implement the function using the static CMOS topology
• Draw the logic graph of the circuit
• Write the possible consistent Euler path if any
• Draw the stick diagram of the circuit.
• Sketch the layout

207

Solution in Class

208
Assignment
• For the following functions
– F=abc!
– F=(a+b+c)!
– F=(abc + bd)!
• Implement the function using the static CMOS topology
• Draw the logic graph of the circuit
• Write the possible consistent Euler paths if any
• Draw the stick diagram of the circuit.
• Sketch the layout

209

Lecture 7
DC & Transient Response

210
DC Response
• DC Response: Vout vs. Vin for a gate
• Ex: Inverter
– When Vin = 0 ‐> Vout = VDD
– When Vin = VDD ‐> Vout = 0
– In between, Vout depends on transistor size and current
– By KCL, must settle such that Idsn = |Idsp|
– We could solve equations
– But graphical solution gives more insight
211

Transistor Operation
• Current depends on region of transistor 
behavior
• For what Vin and Vout are nMOS and pMOS in
– Cutoff?
– Linear?
– Saturation?

212
nMOS Operation

Cutoff          Linear               Saturated
Vgsn < Vtn      Vgsn > Vtn           Vgsn > Vtn
Vin < Vtn       Vin > Vtn            Vin > Vtn
                Vdsn < Vgsn – Vtn    Vdsn > Vgsn – Vtn
                Vout < Vin - Vtn     Vout > Vin - Vtn

Vgsn = Vin
Vdsn = Vout

213

pMOS Operation

Cutoff             Linear               Saturated
Vgsp > Vtp         Vgsp < Vtp           Vgsp < Vtp
Vin > VDD + Vtp    Vin < VDD + Vtp      Vin < VDD + Vtp
                   Vdsp > Vgsp – Vtp    Vdsp < Vgsp – Vtp
                   Vout > Vin - Vtp     Vout < Vin - Vtp

Vgsp = Vin - VDD
Vdsp = Vout - VDD
Vtp < 0

214
Operating Regions

• Revisit transistor operating regions

Region   nMOS         pMOS
A        Cutoff       Linear
B        Saturation   Linear
C        Saturation   Saturation
D        Linear       Saturation
E        Linear       Cutoff

[Figure: inverter transfer curve Vout vs. Vin with regions A to E; Vin axis marked at 0, Vtn, VDD/2, VDD+Vtp, and VDD]

215

Noise Margins
• How much noise can a gate input see before it does not recognize the input?
[Figure: output characteristics (logical high output range from VOH to VDD, logical low output range from GND to VOL) next to input characteristics (logical high input range above VIH, indeterminate region between VIL and VIH, logical low input range below VIL); NMH lies between VOH and VIH, NML between VIL and VOL]

216
Beta Ratio
• If βp / βn ≠ 1, switching point will move from VDD/2
• Called skewed gate
• Other gates: collapse into equivalent inverter
[Figure: Vout vs. Vin transfer curves for βp/βn = 10, 2, 1, 0.5, and 0.1]
217

Noise Margins
• For robust circuits, want the “0” and “1” intervals to be as large as possible
– Noise Margin High: NMH = VOH - VIH
– Noise Margin Low: NML = VIL - VOL
[Figure: gate output levels VOH ("1") and VOL ("0") compared with gate input thresholds VIH and VIL, with the undefined region between VIL and VIH]
• Large noise margins are desirable


Fan-In and Fan-Out
• Fan-out – number of load gates (N) connected to the output of the driving gate
– gates with large fan-out are slower
• Fan-in – the number of inputs (M) to the gate
– gates with large fan-in are bigger and slower

The Ideal Inverter


• The ideal gate should have
– infinite gain in the transition region
– a gate threshold located in the middle of the logic swing
– high and low noise margins equal to half the swing
– input and output impedances of infinity and zero, resp.
[Figure: ideal voltage transfer characteristic Vout vs. Vin with Ri = ∞, Ro = 0, Fanout = ∞, gain g = -∞ in the transition region, and NMH = NML = VDD/2]

Transient Response
• Transient Response of Digital Circuit

222
Inverter Dynamic Behavior

• Performance of CMOS Inverter: The Dynamic Behavior
– Computing the Capacitances
– Propagation Delay: First‐Order Analysis
– Propagation Delay from a Design Perspective

223

Delay Definitions
• Propagation delay time, tpd = maximum time from the input crossing 50% to the output crossing 50%
• Contamination delay time, tcd = minimum time from the input crossing 50% to the output crossing 50%
• Rise time, tr = time for a waveform to rise from 20% to 80% of its steady-state value
• Fall time, tf = time for a waveform to fall from 80% to 20% of its steady-state value
• Edge rate, trf = (tr + tf )/2

224
Definitions
• Driver: the gate that charges or discharges a node.
• Load: the gates and wire being driven
• Arrival time: the latest time at which each node in a block of logic will switch.
• The arrival time ai at node i depends on the propagation delay of the gate driving i and the arrival times of the inputs to the gate: ai = max over all fan-in nodes j of (aj) + tpd,i
225

Arrival time Example

226
Timing Optimization
• In a design, there are two types of paths
– Non‐critical paths: do not require any conscious effort when it comes to speed. Usually, they are many.
– Critical paths: limit the operating speed of the system and require attention to timing details.
• The critical paths can be affected at four main levels:
– The architectural/microarchitectural level
– The logic level
– The circuit level
– The layout level

227

Arrival time Example

Identify the critical path.
228
Transient Response
• How to compute the delay time:
– A fundamental way to compute delay is to develop a model of the circuit under study by writing a differential equation relating the output to the input voltage and time. The solution is called the transient response.
– If capacitance C is charged with a current I, the voltage on the capacitor varies as: I = C dV/dt

229

Load Capacitors

230
Modeling Propagation Delay
• Model circuit as first‐order RC network
– vout(t) = (1 – e^(–t/τ)) V, where τ = RC
– Time to reach the 50% point is t = ln(2) τ = 0.69 τ
– Time to rise from the 10% point to the 90% point is t = ln(9) τ = 2.2 τ
• Matches the delay of an inverter gate
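The two constants follow directly from the first-order step response vout(t) = (1 – e^(–t/τ)) V. The sketch below (function names are illustrative) shows that ln(2) τ is the time to reach 50%, and that ln(9) τ is the 10%-to-90% rise time; reaching 90% from zero takes ln(10) τ, about 2.3 τ.

```python
import math


def time_to_fraction(fraction, tau):
    """Time for vout(t) = (1 - exp(-t/tau)) V to reach `fraction` of V."""
    return -tau * math.log(1.0 - fraction)


def rise_time_10_to_90(tau):
    """Time spent between the 10% and the 90% points."""
    return time_to_fraction(0.9, tau) - time_to_fraction(0.1, tau)


if __name__ == "__main__":
    tau = 1.0  # normalized time constant R*C
    print(time_to_fraction(0.5, tau))   # ln(2)  ~ 0.69
    print(rise_time_10_to_90(tau))      # ln(9)  ~ 2.2
    print(time_to_fraction(0.9, tau))   # ln(10) ~ 2.3
```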

231

Inverter Propagation Delay


• Propagation delay is proportional to the time-constant of the network formed by the pull-down resistor and the load capacitance
– tpHL = f(Rn, CL)
– tpHL = ln(2) Reqn CL = 0.69 Reqn CL
– tpLH = ln(2) Reqp CL = 0.69 Reqp CL
– tp = (tpHL + tpLH)/2 = 0.69 CL (Reqn + Reqp)/2
[Figure: inverter with Vin = VDD; the NMOS is modeled as a resistor Rn discharging CL toward Vout = 0]
• To equalize rise and fall times, make the on-resistance of the NMOS and PMOS approximately equal.

232
233

Inverter Transient Response


VDD = 2.5 V, 0.25 μm process
W/Ln = 1.5
W/Lp = 4.5
Reqn = 13 kΩ (÷ 1.5)
Reqp = 31 kΩ (÷ 4.5)
tpHL = 36 psec
tpLH = 29 psec
so tp = 32.5 psec
[Figure: simulated Vin and Vout waveforms (V) vs. t (sec, x 10^-10), with tf, tr, tpHL, and tpLH marked]
From simulation: tpHL = 39.9 psec and tpLH = 31.7 psec
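The hand estimates can be reproduced from tp = 0.69 Req CL. The load capacitance is not stated on the slide, so the 6.1 fF used below is an assumption chosen because it roughly reproduces the quoted 36 psec and 29 psec; the function name is illustrative.

```python
def inverter_delays(reqn_ohm, reqp_ohm, cl_farad):
    """First-order propagation delays of a CMOS inverter."""
    tphl = 0.69 * reqn_ohm * cl_farad   # output falls through the NMOS
    tplh = 0.69 * reqp_ohm * cl_farad   # output rises through the PMOS
    return tphl, tplh, (tphl + tplh) / 2.0


if __name__ == "__main__":
    reqn = 13e3 / 1.5    # 13 kOhm unit resistance scaled by W/L = 1.5
    reqp = 31e3 / 4.5    # 31 kOhm unit resistance scaled by W/L = 4.5
    cl = 6.1e-15         # ASSUMED load capacitance, not given on the slide
    tphl, tplh, tp = inverter_delays(reqn, reqp, cl)
    print(f"tpHL = {tphl * 1e12:.1f} ps, tpLH = {tplh * 1e12:.1f} ps, tp = {tp * 1e12:.1f} ps")
```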

234
Inverter Propagation Delay, Revisited
• To see how a designer can optimize the delay of a gate, we have to expand Req in the delay equation
tpHL = 0.69 Reqn CL
= 0.69 CL ((3/4) VDD / IDSATn)
≈ 0.52 CL / ((W/L)n k’n VDSATn), for VDD much larger than VTn + VDSATn/2
[Figure: normalized propagation delay vs. VDD (V), for VDD from 0.8 V to 2.4 V]

235

Design for Performance


• Reduce CL
– internal diffusion capacitance of the gate itself
• keep the drain diffusion as small as possible
– interconnect capacitance
– fanout
• Increase W/L ratio of the transistor
– the most powerful and effective performance optimization tool in the hands of the designer
– watch out for self‐loading, when the intrinsic capacitance dominates the extrinsic load
• Increase VDD
– can trade off energy for performance
– increasing VDD above a certain level yields only very minimal improvements
– reliability concerns enforce a firm upper bound on VDD

236
Transistor Effective Resistance
• Let’s assume that the smallest size NMOS has resistance R. A PMOS of the same (smallest) size will have a resistance of 2R to 3R (because of the mobility difference).
• Doubling the transistor width reduces its resistance to half.

237

Transistor Sizing in Complex Gates

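[Figure: 2-input NAND and NOR gates modeled with resistors Rp and Rn, load capacitance CL, and internal node capacitance Cint, annotated with relative transistor sizes 1 and 2]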

238
Transistor Sizing a Complex CMOS Gate

OUT = !(D + A • (B + C))
[Figure: transistor-level schematic of the gate before sizing]

239

Transistor Sizing a Complex CMOS Gate

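OUT = !(D + A • (B + C))
Relative transistor widths (for the PUN: value for equal resistance, then value after scaling the PMOS by 3 for mobility):
– PUN: A: 2, then 6; B: 4, then 12; C: 4, then 12; D: 2, then 6
– PDN: A: 2; B: 2; C: 2; D: 1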

240
Capacitances of Sized Gate

241

Assume that the smallest size NMOS has resistance R and that the gate capacitance = junction capacitance = C, μn = 3 μp
1. Size the PMOS and NMOS transistors such that the rise and fall times are close to one another.
2. Calculate the minimum and maximum delay time in terms of transistor resistance and self-load capacitance.

X = !(C • (A + B)) and X = !((A+B)•(C+D))
[Figure: OAI21 and OAI22 transistor-level schematics]
242
Solution : in class

243

Solution : in class

244
Input Pattern Effects on Delay
• Delay is dependent on the pattern of inputs
• Low to high transition
– both inputs go low
• delay is 0.69 (Rp/2) CL since two p‐resistors are on in parallel
– one input goes low
• delay is 0.69 Rp CL
• High to low transition
– both inputs go high
• delay is 0.69 (2Rn) CL
• Adding transistors in series (without sizing) slows down the circuit
[Figure: 2-input NAND modeled with Rp, Rn, CL, and Cint]
245

Delay Dependence on Input Patterns


2-input NAND with
NMOS = 0.5μm/0.25 μm
PMOS = 0.75μm/0.25 μm
CL = 10 fF

Input Data Pattern    Delay (psec)
A=B=0→1               69
A=1, B=0→1            62
A=0→1, B=1            50
A=B=1→0               35
A=1, B=1→0            76
A=1→0, B=1            57

[Figure: output voltage (V) vs. time (psec) for the input patterns A=B=1→0; A=1→0, B=1; and A=1, B=1→0]
246
Exercises
• For the following functions
F = ABC
F = A+ B +C
F = ABC + BD
F = AB + CD + E
Assume that the smallest NMOSFET with W/L = 1 has a resistance R and capacitances CDB = CSB = CG = C.
• Implement the function using the static CMOS topology
• Size the PMOS and NMOS transistors such that the rise and fall times are close to one another. Assume that μn = 3 μp
• Calculate the output self-loading capacitance and the input capacitance in terms of C.
• Calculate the minimum and maximum delay time in terms of R and C

247

Lecture 11
Power
Sources of Power Dissipation
• Power dissipation in CMOS circuits comes from two 
components:
– Dynamic power dissipation :
• Switching power: charging and discharging load capacitances as gates switch
• Short Circuit power: consumed while both pMOS and 
nMOS stacks are partially ON
– Static power  dissipation
• subthreshold leakage through OFF transistors
• gate leakage through gate dielectric
• Junction leakage  from source/drain diffusions

249

Power and Energy


• Power is drawn from a voltage source attached to the VDD pin(s) of a chip.

• Instantaneous power: $P(t) = I(t)\,V(t)$

• Energy: $E = \int_0^T P(t)\,dt$

• Average power: $P_{avg} = \frac{E}{T} = \frac{1}{T}\int_0^T P(t)\,dt$

Power in Circuit Elements

$P_{VDD}(t) = I_{DD}(t)\,V_{DD}$

$P_R(t) = \frac{V_R^2(t)}{R} = I_R^2(t)\,R$

$E_C = \int_0^\infty I(t)\,V(t)\,dt = \int_0^\infty C\,\frac{dV}{dt}\,V(t)\,dt = C\int_0^{V_C} V\,dV = \frac{1}{2} C V_C^2$

Charging a Capacitor
• When the gate output rises
– Energy stored in capacitor is
$E_C = \frac{1}{2} C_L V_{DD}^2$
– But energy drawn from the supply is
$E_{VDD} = \int_0^\infty I(t)\,V_{DD}\,dt = \int_0^\infty C_L\,\frac{dV}{dt}\,V_{DD}\,dt = C_L V_{DD}\int_0^{V_{DD}} dV = C_L V_{DD}^2$
– Half the energy from VDD is dissipated in the pMOS transistor as heat, other half stored in capacitor

Switching Waveforms
• Example: VDD = 1.0 V, CL = 150 fF, f = 1 GHz

253

Switching Power
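Suppose that the gate switches at some average frequency fsw. Over some interval T, the load will be charged and discharged T fsw times. Then, the average power dissipation can be given by:

$P_{switching} = \frac{1}{T}\int_0^T i_{DD}(t)\,V_{DD}\,dt = \frac{V_{DD}}{T}\,[T f_{sw} C V_{DD}] = C V_{DD}^2 f_{sw}$

[Figure: gate powered from VDD drawing supply current iDD(t) while switching at frequency fsw]
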
Activity Factor
• Suppose the system clock frequency = f
• Let fsw = αf, where α = activity factor
– If the signal is a clock, α = 1
– If the circuit switches once/period, α = 1/2

• Dynamic power:
$P_{switching} = \alpha C V_{DD}^2 f$
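The switching power expression can be checked with the numbers from the Switching Waveforms slide (VDD = 1.0 V, CL = 150 fF, f = 1 GHz). This is only a sketch: the function name `switching_power` is illustrative, and α = 1 is an assumption made for the example, not a value stated on the slides.

```python
# Switching power P = alpha * C * VDD^2 * f.

def switching_power(alpha, c_load, vdd, f_clk):
    """Average switching power in watts."""
    return alpha * c_load * vdd ** 2 * f_clk

# VDD, CL and f are from the Switching Waveforms example; alpha = 1 is assumed.
p_switching = switching_power(alpha=1.0, c_load=150e-15, vdd=1.0, f_clk=1e9)
print(f"P_switching = {p_switching * 1e6:.0f} uW")
```

Because VDD enters quadratically, halving the supply voltage in this sketch reduces the result by a factor of four, while halving α, C, or f only halves it.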

Short Circuit Current


• When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
• Leads to a blip of "short circuit" current.
• < 10% of dynamic power if rise/fall times are comparable for input and output
• We usually ignore this component

Power Dissipation Components
• Ptotal = Pdynamic + Pstatic
• Dynamic power: Pdynamic = Pswitching + Pshortcircuit
– Switching load capacitances
– Short-circuit current
• Static power: Pstatic = (Isub + Igate + Ijunct) VDD
– Subthreshold leakage
– Gate leakage
– Junction leakage

Dynamic Power Estimation


• The dynamic power is dominated by switching power.
• To estimate this power, one can consider the effective capacitance of each node of the circuit.
• The effective capacitance is the actual capacitance multiplied by the activity factor.

Dynamic Power Example
• 1 billion transistor chip contains the following:
– 50M logic transistors
• Average width: 12 λ
• Activity factor = 0.1
– 950M memory transistors
• Average width: 4 λ
• Activity factor = 0.02
– 1.0 V 65 nm process (λ = 25 nm)
– C = 1 fF/μm (gate) + 0.8 fF/μm (diffusion)
• Estimate dynamic power consumption @ 1 GHz. Neglect wire capacitance and short-circuit current.

Solution

$C_{logic} = (50 \times 10^6)(12\lambda)(0.025\ \mu m/\lambda)(1.8\ fF/\mu m) = 27\ nF$

$C_{mem} = (950 \times 10^6)(4\lambda)(0.025\ \mu m/\lambda)(1.8\ fF/\mu m) = 171\ nF$

$P_{dynamic} = [0.1\,C_{logic} + 0.02\,C_{mem}](1.0)^2(1.0\ GHz) = 6.1\ W$

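The arithmetic of this solution can be reproduced in a few lines. A minimal sketch follows; the helper name `block_capacitance` and the constant names are illustrative, while the numbers are the ones given in the example.

```python
# Dynamic power estimate for the 1 billion transistor chip example.

LAMBDA_UM = 0.025      # lambda in micrometres (65 nm process, lambda = 25 nm)
CAP_PER_UM = 1.8e-15   # 1 fF/um gate + 0.8 fF/um diffusion, in farads per um
VDD = 1.0              # supply voltage in volts
F_CLK = 1e9            # clock frequency in hertz

def block_capacitance(n_transistors, width_lambda):
    """Total gate plus diffusion capacitance of a block, in farads."""
    return n_transistors * width_lambda * LAMBDA_UM * CAP_PER_UM

c_logic = block_capacitance(50e6, 12)   # logic transistors, 12 lambda wide
c_mem = block_capacitance(950e6, 4)     # memory transistors, 4 lambda wide

# Effective capacitance = actual capacitance times activity factor.
p_dynamic = (0.1 * c_logic + 0.02 * c_mem) * VDD ** 2 * F_CLK

print(f"C_logic = {c_logic * 1e9:.0f} nF, C_mem = {c_mem * 1e9:.0f} nF")
print(f"P_dynamic = {p_dynamic:.2f} W")
```

Although the memory holds most of the capacitance, its low activity factor keeps its contribution close to that of the much smaller logic block.
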
Activity Factor Estimation
• The activity factor of a node is the probability that it switches from 0 to 1 (P0→1).

Transition probability of a node (out):

αout or P0→1 = Pout=0 × Pout=1 = P0 × (1 − P0)

2-input NOR gate
A B Out
0 0 1
0 1 0
1 0 0
1 1 0

With input signal probabilities PA=1 = 1/2 and PB=1 = 1/2, the NOR transition probability is

α = 3/4 × 1/4 = 3/16

NOR Gate Transition Probabilities


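• Switching activity is a strong function of the input signal statistics
– PA and PB are the probabilities that inputs A and B are one

[Figure: 2-input NOR gate with inputs A and B driving load CL, and a plot of the transition probability versus PA and PB]

P0→1 = P0 × P1 = (1 − (1 − PA)(1 − PB)) (1 − PA)(1 − PB)
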

Transition Probabilities for Some Basic Gates

P0→1 = Pout=0 × Pout=1

Gate    P0→1
NOR     (1 − (1 − PA)(1 − PB)) × (1 − PA)(1 − PB)
OR      (1 − PA)(1 − PB) × (1 − (1 − PA)(1 − PB))
NAND    PAPB × (1 − PAPB)
AND     (1 − PAPB) × PAPB
XOR     (1 − (PA + PB − 2PAPB)) × (PA + PB − 2PAPB)

[Figure: input A (probability 0.5) drives node X; X and input B (probability 0.5) drive the gate with output Z]

For X: P0→1 = P0 × P1 = (1 − PA) PA = 0.5 × 0.5 = 0.25
For Z: P0→1 = P0 × P1 = (1 − PXPB) PXPB = (1 − (0.5 × 0.5)) × (0.5 × 0.5) = 3/16

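The table of transition probabilities can be turned into a small calculator. The sketch below assumes independent inputs, as the slides do; the function names `p_out_one` and `transition_probability` are introduced only for illustration. Exact fractions are used so the results can be compared with the values on the slides.

```python
from fractions import Fraction

def p_out_one(gate, pa, pb):
    """Probability that a 2-input gate output is 1, for independent inputs."""
    if gate == "AND":
        return pa * pb
    if gate == "NAND":
        return 1 - pa * pb
    if gate == "OR":
        return 1 - (1 - pa) * (1 - pb)
    if gate == "NOR":
        return (1 - pa) * (1 - pb)
    if gate == "XOR":
        return pa + pb - 2 * pa * pb
    raise ValueError(f"unknown gate: {gate}")

def transition_probability(gate, pa, pb):
    """P(0->1) = P(out = 0) * P(out = 1)."""
    p1 = p_out_one(gate, pa, pb)
    return (1 - p1) * p1

half = Fraction(1, 2)
for gate in ("NOR", "OR", "NAND", "AND", "XOR"):
    print(gate, transition_probability(gate, half, half))
```

For equiprobable inputs the NOR, OR, NAND, and AND gates all give 3/16, and XOR gives 1/4, since its output is 1 half of the time.
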
Logic Restructuring
• Logic restructuring: changing the topology of a logic network to reduce transitions

AND: P0→1 = P0 × P1 = (1 − PAPB) × PAPB

[Figure: two implementations of F = A·B·C·D with all input probabilities equal to 0.5.
Chain: W = A·B with P0→1 = (1 − 0.25) × 0.25 = 3/16, X = W·C with P0→1 = 7/64, F = X·D with P0→1 = 15/256.
Tree: Y = A·B with P0→1 = 3/16, Z = C·D with P0→1 = 3/16, F = Y·Z with P0→1 = 15/256.]

• Chain implementation has a lower overall switching activity than the tree implementation for random inputs
• Ignores glitching effects
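The comparison between the chain and the tree can be verified by propagating signal probabilities through the AND gates. This is a sketch under the same assumptions as the slide (independent inputs with probability 0.5, no glitching); the helper name `and2` is illustrative.

```python
from fractions import Fraction

def and2(pa, pb):
    """Return (P(out = 1), P(0->1)) for a 2-input AND with independent inputs."""
    p1 = pa * pb
    return p1, (1 - p1) * p1

half = Fraction(1, 2)

# Chain: W = A.B, X = W.C, F = X.D
p_w, a_w = and2(half, half)
p_x, a_x = and2(p_w, half)
p_f_chain, a_f_chain = and2(p_x, half)
chain_total = a_w + a_x + a_f_chain

# Tree: Y = A.B, Z = C.D, F = Y.Z
p_y, a_y = and2(half, half)
p_z, a_z = and2(half, half)
p_f_tree, a_f_tree = and2(p_y, p_z)
tree_total = a_y + a_z + a_f_tree

print("chain:", a_w, a_x, a_f_chain, "total", chain_total)
print("tree: ", a_y, a_z, a_f_tree, "total", tree_total)
```

The output node F has the same activity in both cases; the difference comes from the internal nodes, where the chain's second stage switches less often than either first-level gate of the tree.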

Input Ordering

[Figure: two orderings of a two-gate AND chain computing F from A (P = 0.5), B (P = 0.2), and C (P = 0.1).
Left: X = A·B, F = X·C; P0→1 at X = (1 − 0.5 × 0.2) × (0.5 × 0.2) = 0.09.
Right: X = B·C, F = X·A; P0→1 at X = (1 − 0.2 × 0.1) × (0.2 × 0.1) = 0.0196.]

Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5)

Dynamic Power Reduction

$P_{switching} = \alpha C V_{DD}^2 f$

• Try to minimize:
– Activity factor
– Capacitance
– Supply voltage
– Frequency

Static power dissipation
• In nanometer technologies, nearly one-third of the power is leakage.
• Components of static power:
– Subthreshold leakage through OFF transistors
– Gate leakage through gate dielectric
– Junction leakage from source/drain diffusions

Low Power Design Techniques


• Power Gating: turn OFF power to blocks when they are idle to save leakage
– Use virtual VDD (VDDV)
– Gate outputs to prevent invalid logic levels to next block
• Clock Gating Technique
• Multi-Vth Technique
• Multi-VDD Technique
• Dynamic VDD
• Dynamic Frequency Scaling

Static Power Example
• Revisit power estimation for 1 billion transistor chip
• Estimate static power consumption
– Subthreshold leakage
• Normal Vt: 100 nA/μm
• High Vt: 10 nA/μm
• High Vt used in all memories and in 95% of logic gates
– Gate leakage: 5 nA/μm
– Junction leakage negligible

Solution

$W_{normal\text{-}Vt} = (50 \times 10^6)(12\lambda)(0.025\ \mu m/\lambda)(0.05) = 0.75 \times 10^6\ \mu m$

$W_{high\text{-}Vt} = [(50 \times 10^6)(12\lambda)(0.95) + (950 \times 10^6)(4\lambda)](0.025\ \mu m/\lambda) = 109.25 \times 10^6\ \mu m$

$I_{sub} = [W_{normal\text{-}Vt} \times 100\ nA/\mu m + W_{high\text{-}Vt} \times 10\ nA/\mu m]/2 = 584\ mA$

$I_{gate} = [(W_{normal\text{-}Vt} + W_{high\text{-}Vt}) \times 5\ nA/\mu m]/2 = 275\ mA$

$P_{static} = (584\ mA + 275\ mA)(1.0\ V) = 859\ mW$

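The static power solution can be recomputed step by step. The sketch below uses only the numbers from the example; the variable names are illustrative. The division by 2 is taken directly from the solution as written; a common reading is that roughly half of the transistors are OFF (and leaking) at any time, which is an assumption here rather than something the slide states.

```python
# Static power estimate for the 1 billion transistor chip example.

LAMBDA_UM = 0.025                              # lambda in micrometres

logic_width_um = 50e6 * 12 * LAMBDA_UM         # total logic transistor width
mem_width_um = 950e6 * 4 * LAMBDA_UM           # total memory transistor width

# High Vt is used in all memories and in 95% of the logic.
w_normal_vt = logic_width_um * 0.05
w_high_vt = logic_width_um * 0.95 + mem_width_um

# Leakage per micrometre of width, in amperes.
I_SUB_NORMAL = 100e-9
I_SUB_HIGH = 10e-9
I_GATE = 5e-9

# Factor 1/2 as in the solution (assumed: about half the devices leak).
i_sub = (w_normal_vt * I_SUB_NORMAL + w_high_vt * I_SUB_HIGH) / 2
i_gate = (w_normal_vt + w_high_vt) * I_GATE / 2

p_static = (i_sub + i_gate) * 1.0              # VDD = 1.0 V

print(f"I_sub = {i_sub * 1e3:.0f} mA, I_gate = {i_gate * 1e3:.0f} mA")
print(f"P_static = {p_static * 1e3:.0f} mW")
```
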
Lecture 12
Overview on Programmable Logics

Source: The Design Warrior’s Guide to FPGAs ; Clive “Max” Maxfield,


ELSEVIER 2004

Outline
• Introduction
• Fundamental Concepts
• The Origin of FPGAs
• Alternative FPGA Architectures
• Programming (Configuring) an FPGA
• FPGA and FPAA vendors
• Varieties of Design Flows

Introduction
Q: What are FPGAs?
A: Field programmable gate arrays (FPGAs) are digital integrated circuits (ICs) that contain configurable (programmable) blocks of logic along with configurable interconnects between these blocks.
• Some FPGAs may only be programmed a single time (one-time programmable (OTP)), while others may be reprogrammed over and over again.
• The "field programmable" portion of the FPGA's name refers to the fact that its programming takes place "in the field" (as opposed to devices whose internal functionality is hardwired by the manufacturer).

Introduction
• PLD (SPLD, CPLD): Programmable Logic Devices (Simple, Complex). Small-size programmable logic devices.
• PLD vs. ASIC: Application-Specific Integrated Circuits. An optimized design (power, area, performance) for a specific company, but the design is frozen in silicon.
Q: Why are FPGAs of interest?
A: FPGAs come in between PLDs and ASICs because their functionality can be customized in the field like PLDs, but they can contain millions of logic gates and be used to implement extremely large and complex functions that previously could be realized only using ASICs.

Introduction
• What can FPGAs be used for?
– Prototyping ASIC designs or providing a hardware platform on which to verify the physical implementation of new algorithms.
– High-speed input/output (I/O) interfaces.
– Communications devices and software-defined radios; radar, image, and DSP applications.

FPGA vs Microprocessor/controller

[Figure: block diagram of a microprocessor: ALU, register section, and control and timing section, connected to the address bus, data bus, and control bus]

A microcontroller runs sequentially, regardless of how fast the controller is.

Fundamental Concepts
• The key thing about FPGAs (Programmability)

Programmed fusible links (Before)


279

Fundamental Concepts
• The key thing about FPGAs (Programmability)

Programmed fusible links (After)


280
Fundamental Concepts
Summary of Programming Technologies

281

The Origin of FPGAs


Technology timeline (dates are approximate)

282
The Origin of FPGAs
• The first programmable ICs were generically referred to as programmable logic devices (PLDs) (PROM, 1970s)

Unprogrammed PROM (predefined AND array, programmable OR array)

283

The Origin of FPGAs

Programmed PROM (predefined AND array, programmable OR array)

284
PLAs
• Programmable logic arrays (PLAs) became available in 1975. They are the most user configurable of the SPLDs (both the AND and OR arrays are programmable).

Unprogrammed PLA (programmable AND and OR arrays)

PLAs
Programmed PLA

286
PALs and GALs
• PALs were introduced in order to address the speed problems posed by PLAs. A PAL is almost the exact opposite of a PROM because it has a programmable AND array and a predefined OR array.
CPLDs
• More sophisticated PLD devices known as complex PLDs (CPLDs) started to appear in the early 80s.
• Altera (1984) introduced a CPLD based on a combination of CMOS and EPROM.

A generic CPLD structure

Structure of FPGA

289

SRAM-based devices
• The majority of FPGAs are based on the use of SRAM configuration cells
– New design ideas can be quickly implemented and tested
– One downside of SRAM-based devices is that they have to be reconfigured every time the system is powered up.

Antifuse-based devices
• These devices are nonvolatile (their configuration data remains when the system is powered down)
• These devices don't require an external memory chip to store their configuration data
• Their interconnect structure is naturally "rad hard," immune to the effects of radiation. Good for military and aerospace applications

EEPROM/FLASH-based devices
• Once programmed, the data they contain is nonvolatile, so these devices would be "instant on" when power is first applied to the system.

Architecture of Logic Block 1/2
1- MUX-based Architecture

[Figure: MUX-based logic block implementing the 3-input function y = (a & b) | c]

Architecture of Logic Block 2/2


• LUTs: A group of input signals is used as an index (pointer) to a lookup table. 
The contents of this table are arranged such that the cell pointed to by each 
input combination contains the desired value. Example:
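
As a software analogy for the lookup-table idea, the sketch below fills a table for the 3-input function y = (a & b) | c (the function used on the MUX-based slide) and then evaluates it purely by indexing. The function names `build_lut` and `lut_read` are illustrative and do not correspond to any vendor tool or primitive.

```python
# A lookup table (LUT) stores one output bit per input combination;
# the inputs are used as an index into the table.

def build_lut(func, n_inputs):
    """Fill a 2**n_inputs entry table with func evaluated on every input pattern."""
    table = []
    for index in range(2 ** n_inputs):
        # Unpack the index into bits, first input = most significant bit.
        bits = [(index >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        table.append(func(*bits))
    return table

def lut_read(table, *inputs):
    """Use the input bits as an index (pointer) into the table."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return table[index]

lut = build_lut(lambda a, b, c: (a & b) | c, 3)
print(lut)                    # table contents for abc = 000 ... 111
print(lut_read(lut, 1, 1, 0))
```

Once the table is filled, changing the implemented function only means loading different contents; the indexing logic stays the same, which is what makes the block programmable.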

294
A Simplified Xilinx logic cell

295

Connection of CLBs

[Figure: a CLB containing four slices (the number of slices depends on the FPGA family), and a slice containing two logic cells]

Embedded RAMs
FPGAs include relatively large chunks of embedded RAM called e-RAM or block RAM. Depending on the architecture of the component, these blocks might be positioned around the periphery of the device, scattered across the face of the chip in relative isolation, or organized in columns, as shown.

[Figure: bird's-eye view of chip with columns of embedded RAM blocks]

Embedded multipliers, adders, MACs, etc


Some functions, like multipliers, are inherently slow if they are
implemented by connecting a large number of programmable
logic blocks together. Since these functions are required by a
lot of applications, many FPGAs incorporate special hardwired
multiplier blocks.

[Figure: multiply-and-accumulate (MAC)]

Embedded processor cores (hard and soft)

Some FPGAs contain more than one μP core distributed over the chip.

[Figure: bird's-eye view of chip with embedded core outside of the main fabric]

FPGA vendors
Company            Web site             Comment
Actel Corp.        www.actel.com        FPGAs
Altera Corp.       www.altera.com       FPGAs
Anadigm Inc.       www.anadigm.com      FPAAs
Atmel Corp.        www.atmel.com        FPGAs
QuickLogic Corp.   www.quicklogic.com   FPGAs
Xilinx Inc.        www.xilinx.com       FPGAs

There may be more companies existing or even dying.

FPGA Example: Artix-7 100T

• 15,850 logic slices, each with four 6-input LUTs and 8 flip-flops
• 4,860 Kbits of fast block RAM
• Six clock management tiles, each with phase-locked loop (PLL)
• 240 DSP slices
• Internal clock speeds exceeding 450 MHz
• On-chip analog-to-digital converter (XADC)

Lecture 13
Introduction to VHDL

Source: Circuit Design with VHDL By Volnei A. Pedroni, MIT Press 2004
VHDL

Very High Speed Integrated Circuit (VHSIC) Hardware Description Language

EDA Tools
• EDA (Electronic Design Automation) tools are used for circuit synthesis, implementation, and simulation using VHDL.
• FPGA vendors offer place and route tools to allow the synthesis of VHDL code onto their CPLD/FPGA devices. Examples:
– Xilinx offers the ISE suite
– Altera provides Quartus II
• Such tools can also be provided by specialized EDA companies (e.g., ModelSim by Mentor Graphics).

Fundamental VHDL Units

305

Library Declarations

The std and work libraries are visible by default (no need to declare them).

306
ENTITY

Example:

307

ARCHITECTURE
• There are two general styles for describing module functionality: behavioral and structural.
– Behavioral models describe what a module does.
– Structural models describe how a module is built from simpler pieces.

ARCHITECTURE

Example:

309

Example 1

310
Example 2: 1-bit equality comparator

311

Example 3: 32 bit Adder


library IEEE; use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;

entity adder is
port(a, b: in STD_LOGIC_VECTOR(31 downto 0);
y: out STD_LOGIC_VECTOR(31 downto 0));
end;
architecture synth of adder is
begin
y <= a + b;
end;

312
An example: a comparator in VHDL

[Figure: the comparator chip eqcomp4 with inputs a3, a2, a1, a0 and b3, b2, b1, b0 and output equals; equals is asserted when A = [a3,a2,a1,a0] equals B = [b3,b2,b1,b0]]

An example of a comparator
library IEEE;
use IEEE.STD_LOGIC_1164.all;  -- provides std_logic and std_logic_vector

entity eqcomp4 is
  port (a, b: in std_logic_vector(3 downto 0);
        equals: out std_logic);
end eqcomp4;

architecture dataflow1 of eqcomp4 is
begin
  -- "comment": equals is active high
  equals <= '1' when (a = b) else '0';
end dataflow1;

Exercise 1.1
• In the eqcomp4 VHDL code:
– How many IO pins?
– What are their names and types?
– What are the meanings of std_logic and std_logic_vector?

Exercise 1.2
1    entity test1 is
2    port (in1,in2:  in bit;
3           out1: out bit;
4   end test1;
5
6   architecture test1arch of test1 is
7   begin
8 out1<= in1 or in2; 
9   end test1_arch;
– Give the line numbers of (i) the entity declaration and (ii) the architecture. Also find an error in the code.
– What are the functions of (i) the entity declaration and (ii) the architecture?
– Draw the chip and name the pins. (Don't forget the two most important pins.)
– Underline the words that are user defined in the above VHDL code.

Exercise 1.3
• Rewrite example 2 with the following changes:
– Entity name is not test1 but test1b
– Inputs are not in1 and in2 but a and b, respectively
– Output is not out1 but out1b
– Logic type is not bit but std_logic
– Architecture name is not test1arch but test1b_arch

IN, OUT, INOUT, BUFFER IO modes


• IN: data flows in, like an input pin
• OUT: data flows out, just like an output. The output cannot be read back by the entity
• INOUT: bi-directional, used for data lines of a CPU etc.
• BUFFER: similar to OUT but it can be read back by the entity. Used for control/address pins of a CPU etc.

Exercise 1.4
• Draw the schematics of the four types of IOs
• Based on the following schematic, name and identify the modes of the IO pins.
Exercise 1.5:
Draw the schematic circuit corresponding to the following VHDL code:

library IEEE;
use IEEE.STD_LOGIC_1164.all;  -- provides std_logic_vector

entity test is
  port (in1 : in std_logic_vector (2 downto 0);
        out1 : out std_logic_vector (3 downto 0));
end test;
architecture test_arch of test is
begin
  out1(0) <= in1(1);
  out1(1) <= in1(2);
  out1(2) <= in1(0) and in1(1);
  out1(3) <= '1';
end test_arch;

Exercise 1.6:
1. Write the entity of this device
2. Fill in the truth table and write the VHDL code

In1 in2 out00 out10 out11 out01

321

Knowledge Skills Should be Fulfilled


• You should know
– Entity
– Entity declaration
– Use of port()
– Modes of IO signals
– Structure of a simple Architecture body

Design Synthesis
• The synthesizer converts HDL (VHDL/Verilog) code into a gate-level netlist (represented in terms of the component library).
• After synthesis, the CAD tool can show the RTL schematic and the technology schematic.

Example
• Code:

324
RTL Schematics

325

Technology Schematic

326
Design Implementation (1)
• Design implementation is the process of translating, mapping, placing, routing, and generating a bitstream file for the design.
• Place and route (PAR)
– Place and route is the most important and time-consuming step of the implementation. It defines how device resources are located and interconnected inside an FPGA to form the design.
– Bad placement would make good routing impossible.

Inside FPGA

328
Generate Programming File
• Generate Programming File: this generates the configuration bits into a (*.bit) file. The file is used to program the target FPGA board to behave like the designed circuit.

Configuring the Target Device


• The process of downloading the programming file into the target device

Lecture 14
Sequential Circuits

331

Sequential Logic

[Figure: combinational logic block with inputs and outputs; its next state is stored in clocked state registers and fed back as the current state]

Timing Metrics

[Figure: timing diagram. In: data must be stable for tsu before and thold after the clock edge. Out: the output becomes stable tc-q after the clock edge.]

Latches vs Flipflops
• Latches
– level sensitive circuit that passes inputs to Q when the clock is high (or low) - transparent mode
– input sampled on the falling edge of the clock is held stable when clock is low (or high) - hold mode
• Flipflops (edge-triggered)
– edge sensitive circuits that sample the inputs on a clock transition
• positive edge-triggered: 0 → 1
• negative edge-triggered: 1 → 0
– built using latches (e.g., master-slave flipflops)

SR Latch

S R Q !Q
0 0 Q !Q   memory
1 0 1 0    set
0 1 0 1    reset
1 1 0 0    disallowed

[Figure: SR latch with inputs S and R and outputs Q and !Q]

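Clocked D Latch

[Figure: clocked D latch with inputs D and clock and outputs Q and !Q, together with its symbol and a clock waveform marking the transparent mode and the hold mode]
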
MUX Based Latches
• Change the stored value by cutting the feedback loop

Negative latch: Q = clk & Q | !clk & D (transparent when the clock is low)
Positive latch: Q = !clk & Q | clk & D (transparent when the clock is high)

[Figure: two 2:1 MUX latches with feedback from Q. Negative latch: feedback on MUX input 1, D on input 0. Positive latch: feedback on MUX input 0, D on input 1.]
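
The two latch equations can be exercised with a short behavioral simulation. This is only a sketch with ideal, zero-delay logic; the function names and the clock and data sequences are made up for the illustration.

```python
# Behavioral models of the MUX based latches, one evaluation per time step.

def negative_latch(clk, d, q):
    """Q = clk & Q | !clk & D : transparent when the clock is low."""
    return (clk & q) | ((1 - clk) & d)

def positive_latch(clk, d, q):
    """Q = !clk & Q | clk & D : transparent when the clock is high."""
    return ((1 - clk) & q) | (clk & d)

def simulate(latch, clk_seq, d_seq, q0=0):
    """Apply the latch equation step by step and record Q."""
    q = q0
    trace = []
    for clk, d in zip(clk_seq, d_seq):
        q = latch(clk, d, q)
        trace.append(q)
    return trace

clk = [0, 0, 1, 1, 0, 0, 1, 1]
d   = [1, 0, 1, 0, 1, 1, 0, 0]

pos_trace = simulate(positive_latch, clk, d)
neg_trace = simulate(negative_latch, clk, d)
print("positive latch Q:", pos_trace)
print("negative latch Q:", neg_trace)
```

In the traces, each latch follows D only during its transparent clock level and keeps the last sampled value otherwise.
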

TG MUX Based Latch Implementation

[Figure: transmission-gate MUX latch with D, Q, clk, and !clk, showing the path on which the input is sampled (transparent mode) and the feedback path (hold mode)]

PT MUX Based Latch Implementation

[Figure: pass-transistor MUX latch with D, Q, !Q, clk, and !clk, showing the path on which the input is sampled (transparent mode) and the feedback path (hold mode)]

• Reduced clock load, but there is a threshold drop at the output of the pass transistors, so noise margins and performance are reduced

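Master Slave Based ET Flipflop

[Figure: edge-triggered flipflop built from a master latch (D to QM) followed by a slave latch (QM to Q), both clocked by clk, with waveforms of clock, D, QM, and Q]

             Master        Slave
clk = 0      transparent   hold
clk = 0→1    hold          transparent
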
MS ET Implementation

[Figure: transmission-gate master-slave flipflop built from inverters I1 to I6 and transmission gates T1 to T4, with input D, internal node QM, output Q, and clocks clk and !clk]

clk low: master transparent, slave hold
clk high: master hold, slave transparent

MS ET Timing Properties
• Assume propagation delays are tpd_inv and tpd_tx, that the contamination delay is 0, and that the inverter delay to derive !clk is 0
• Set-up time - time before rising edge of clk that D must be valid
3 * tpd_inv + tpd_tx
• Propagation delay - time for QM to reach Q
tpd_tx + tpd_inv
• Hold time - time D must be stable after rising edge of clk
zero
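
The three timing expressions above can be evaluated for given gate delays. In this sketch the inverter and transmission-gate delays are hypothetical values chosen for illustration, and the function name `ms_et_timing` is not from the slides.

```python
# Timing properties of the transmission-gate master-slave flipflop,
# using the expressions from the slide (contamination delay assumed 0).

def ms_et_timing(t_pd_inv, t_pd_tx):
    """Return set-up time, clock-to-Q delay, and hold time in seconds."""
    return {
        "setup": 3 * t_pd_inv + t_pd_tx,   # D must settle through the master
        "clk_to_q": t_pd_tx + t_pd_inv,    # QM propagates through the slave
        "hold": 0.0,                       # zero under the stated assumptions
    }

# Assumed delays: 50 ps per inverter, 40 ps per transmission gate.
timing = ms_et_timing(t_pd_inv=50e-12, t_pd_tx=40e-12)
for name, value in timing.items():
    print(f"{name}: {value * 1e12:.0f} ps")
```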

Set-up Time Simulation

[Figure: simulated waveforms (Volts versus Time in ns) of D, clk, QM, I2 out, and Q for tsetup = 0.21 ns: the flipflop works correctly]

[Figure: the same simulation for tsetup = 0.20 ns: the flipflop fails]

Propagation Delay Simulation

[Figure: simulated output waveforms (Volts versus Time in ns) showing tc-q(LH) = 160 psec and tc-q(HL) = 180 psec]

Static vs Dynamic Storage
• Static storage
– preserve state as long as the power is on
– have positive feedback (regeneration) with an internal connection between the output and the input
– useful when updates are infrequent (clock gating)
• Dynamic storage
– store state on parasitic capacitors
– only hold state for short periods of time (milliseconds)
– require periodic refresh
– usually simpler, so higher speed and lower power

Review: The Regenerative Property


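[Figure: two cascaded inverters (Vi1 to Vo1 = Vi2 to Vo2) and their voltage transfer characteristics plotted with Vi1 = Vo2, intersecting at points A, C, and B]

If the gain in the transient region is larger than 1, only A and B are stable operation points. C is a metastable operation point.
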
Bistable Circuits
• The cross-coupling of two inverters results in a bistable circuit (a circuit with two stable states)
• Have to be able to change the stored value by making A (or B) temporarily unstable by increasing the loop gain to a value larger than 1
– done by applying a trigger pulse at Vi1 or Vi2
– the width of the trigger pulse need be only a little larger than the total propagation delay around the loop circuit (twice the delay of an inverter)
• Two approaches used
– cutting the feedback loop (mux based latch)
– overpowering the feedback loop (as used in SRAMs)

Lecture 15
Memory Circuits

352
Outline
• Memory Architecture
• SRAM Architecture
– SRAM Cell
– Decoders
– Column Circuitry
• DRAM Architecture

Memory (Array) Design


• Array of bits
• Area very important
– Memory takes considerable area in processor chips
– Compaction results in fewer memory chip modules, more on-chip cache
• Timing and power consumption of memory blocks have significant impact on the system
• Different types
– RAM (SRAM, DRAM, CAM)
– ROM (PROM, EEPROM, FLASH)

Memory Architecture
• Address: which one of the M words to access
• Data: the N bits of the word are read/written

[Figure: M words (Word 0 to Word M-1) of N bits each, selected by word select lines S0 to SM-1. Left: one select line per word. Right: a decoder drives the select lines from k = log2(M) address bits A0 to Ak-1.]

Memory Cell Array Considerations

• Memory performance (speed)
– Storage cell speed (read, write)
– Data bus capacitance
– Periphery: address decoders, sense amplifiers, buffers
• Memory area
– Cell array layout
• How to lay out the cell array?
– Linear is bad:
• Long data busses → large capacitance
• A lot of cells connected to the data bus
• Decoder will have a lot of logic levels

[Figure: linear array of M words addressed by a decoder (A0 to Ak-1, select lines S0 to SM-1), with N-bit sense amplifiers / drivers on the data bus]

Memory Cell Array Layout (cont.)
• Group the M words into M/L rows, each containing L words
• Benefits?

[Figure: array of M/L rows by L words. The row decoder takes the upper k − log2(L) address bits (A_log2(L) to A_k-1) and drives the row select lines; each column has N-bit sense amplifiers / drivers; a column decoder + MUX takes the lower log2(L) address bits (A0 to A_log2(L)-1) and outputs N bits.]

Memory Cell Array Access Example

• word = 16-bit wide (N), row = 8 words (L), address = 10 bits (k)
• Accessing word 9 = 0000001001 (binary)

[Figure: array of M/L = 1024/8 = 128 rows of L = 8 words (Word 0 to Word 1023). The row decoder inputs A9..A3 = 0000001 select the row S8..15 (Word 8 to Word 15); the column decoder + MUX inputs A2..A0 = 001 select Word 9 within that row; each column has 16-bit sense amplifiers / drivers.]

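The address split used in this example can be written out explicitly. The sketch below assumes, as in the example, that the number of words per row L is a power of two; the function name `split_address` is illustrative.

```python
# Split a word address into row and column fields for an array with
# L words per row: the low log2(L) bits select the column, the rest the row.

def split_address(address, words_per_row, address_bits):
    """Return (row, column, number of row address bits)."""
    column_bits = words_per_row.bit_length() - 1   # log2(L) for a power of two
    row = address >> column_bits                   # upper bits: row decoder
    column = address & (words_per_row - 1)         # lower bits: column MUX
    return row, column, address_bits - column_bits

# Example from the slide: k = 10 address bits, L = 8 words per row, word 9.
row, column, row_bits = split_address(9, words_per_row=8, address_bits=10)
print(f"row {row}, column {column}, {row_bits} row address bits")
print(f"number of rows: {2 ** row_bits}")
```

Word 9 therefore sits in the second row (row 1) at the second position (column 1), and the 7 row address bits give the 128 rows shown in the example.
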
Hierarchical Memory Structure
• Taking the idea one step further
– Shorter wires within each block
– Enable only one block address decoder → power savings

[Figure: memory divided into blocks, each with row and column address inputs and a block enable (Blk EN) derived from the block address; the blocks share a global bus with global drivers / sense amplifiers]

Memory Access Timing: the Big Picture

• Timing:
– Send address on the address lines, wait for the word line to become stable
– Read/write data on the data lines

[Figure: timing diagram of READ, WRITE, and DATA showing the read cycle, the read access time (until data valid), the write cycle, and the write access time (until data written)]

12T SRAM Cell
• Basic building block: SRAM Cell
– Holds one bit of information, like a latch
– Must be read and written
• 12-transistor (12T) SRAM cell
– Use a simple latch connected to bit_line
– 46 x 75 λ unit cell

[Figure: 12T SRAM cell with signals bit, write, write_b, read, and read_b]

6T SRAM Cell
• Cell size accounts for most of array size
– Reduce cell size at expense of complexity
• 6T SRAM Cell
– Used in most commercial chips
– Data stored in cross-coupled inverters
• Read:
– Precharge bit, bit_b
– Raise wordline
• Write:
– Drive data onto bit, bit_b
– Raise wordline

[Figure: 6T SRAM cell with bitlines bit and bit_b and wordline word]

SRAM Read
• Precharge both bitlines high
• Then turn on wordline
• One of the two bitlines will be pulled down by the cell
• Ex: A = 0, A_b = 1
– bit discharges, bit_b stays high
– But A bumps up slightly
• Read stability
– A must not flip

[Figure: 6T cell (P1, P2, N1 to N4; nodes A and A_b; lines bit, bit_b, and word) and simulated read waveforms of word, bit, bit_b, A, and A_b (0 to 1.5 V) over 0 to 600 ps]

SRAM Write
• Drive one bitline high, the other low
• Then turn on wordline
• Bitlines overpower cell with new value
• Ex: A = 0, A_b = 1, bit = 1, bit_b = 0
– Force A_b low, then A rises high
• Writability
– Must overpower feedback inverter

[Figure: 6T cell (P1, P2, N1 to N4; nodes A and A_b; lines bit, bit_b, and word) and simulated write waveforms of word, bit_b, A, and A_b (0 to 1.5 V) over 0 to 700 ps]

SRAM Sizing
• High bitlines must not overpower inverters during reads
• But low bitlines must write new value into cell

[Figure: 6T cell (bit, bit_b, word, A, A_b) with its transistors labeled weak, med, and strong]

Decoders
• n:2^n decoder consists of 2^n n-input AND gates
– One needed for each row of memory
– Build AND from NAND or NOR gates

[Figure: 2:4 decoder with inputs A1 and A0 and outputs word0 to word3, in static CMOS and pseudo-nMOS implementations, with transistor widths annotated]

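A behavioral sketch of the n:2^n decoder follows: each wordline is the AND of every address bit or its complement, so exactly one wordline is active for any address. The function name `decode` and the bit ordering (most significant bit first) are choices made for this illustration.

```python
# n:2^n decoder built from 2^n n-input AND terms.

def decode(address_bits):
    """address_bits: list of 0/1 values, most significant bit first.
    Returns the list of 2**n wordline values (one-hot)."""
    n = len(address_bits)
    wordlines = []
    for row in range(2 ** n):
        select = 1
        for i, bit in enumerate(address_bits):
            wanted = (row >> (n - 1 - i)) & 1
            # AND in the true bit if the row code has a 1 here,
            # otherwise AND in its complement.
            select &= bit if wanted else 1 - bit
        wordlines.append(select)
    return wordlines

# 2:4 decoder with A1 = 1, A0 = 0: word2 is selected.
words = decode([1, 0])
print(words)
```
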
Large Decoders
• For n > 4, NAND gates become slow
– Break large gates into multiple smaller gates

[Figure: 4:16 decoder with inputs A3, A2, A1, A0 and outputs word0 to word15, built from multiple smaller gates]

Dynamic RAM 4-Transistor Cell

• 4-transistor cell
• Dynamic charge storage must be refreshed
• Dedicated busses for reading and writing

[Figure: 4T DRAM cell with data in, data out, WR, and Rd lines]

Dynamic RAM 3-Transistor Cell
• 3-transistor cell
– No p-type transistors yield a very compact layout for cell
– No Vdd connection
– Sense amplifier must be able to quickly detect dropping voltage
– Precharge data_out’ to generate ‘1’ outputs

[Figure: 3T DRAM cell with precharge, data in, data out, WR, and Rd lines]

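Dynamic RAM 3-Transistor Cell: Timing

[Figure: 3T cell (precharge, data in, data out, WR, Rd, storage node X) and timing waveforms: WR and data in at Vdd during the write, X rising to Vdd − VT, then Rd with data out dropping from Vdd by ΔV]

Value stored at node X when writing a "1" = VWR − VTn
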
Dynamic RAM 1-Transistor Cell
• 1-transistor cell
– Storage capacitor is source of cell transistor
– Special processing steps to make the storage capacitor large
– Charge sharing with bus capacitance (Ccell << Cbus)
– Extra demand on sense amplifier to detect small changes
– Destructive read (must write immediately)

[Figure: 1T DRAM cell: cell transistor gated by the word line Si (WL) and connected to the bit line Bi, which is precharged to a middle voltage level, with the storage capacitor at its source]

Dynamic RAM 1-Transistor Cell: Timing

[Figure: 1T cell (WL, BL, node X, storage capacitance Cs, bit line capacitance CBL) and waveforms for Write "1" and Read "1": X charges from GND to Vdd − VT; BL is at Vdd during the write and at Vdd/2 before sensing]

• Write: Cs is charged/discharged
• Read
– Voltage swing is small (~250 mV)
– $\Delta V = V_{BL} - V_{PRE} = (V_X - V_{PRE}) \cdot \frac{C_s}{C_s + C_{BL}}$

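The charge-sharing expression for the read swing can be evaluated directly. In the sketch below all numeric values (supply, threshold, and both capacitances) are hypothetical and chosen only to show the order of magnitude; the function name `bitline_swing` is illustrative.

```python
# Bit line voltage change when a 1T DRAM cell shares charge with the bit line:
# dV = (VX - VPRE) * Cs / (Cs + CBL)

def bitline_swing(v_x, v_pre, c_s, c_bl):
    """Voltage change on the bit line after the word line is raised."""
    return (v_x - v_pre) * c_s / (c_s + c_bl)

# Assumed values: Vdd = 2.5 V, VT = 0.5 V, Cs = 30 fF, CBL = 300 fF.
VDD, VT = 2.5, 0.5
C_S, C_BL = 30e-15, 300e-15
V_PRE = VDD / 2                      # bit line precharged to Vdd/2

dv_one = bitline_swing(VDD - VT, V_PRE, C_S, C_BL)   # cell stored a "1"
dv_zero = bitline_swing(0.0, V_PRE, C_S, C_BL)       # cell stored a "0"

print(f"reading 1: {dv_one * 1e3:+.1f} mV")
print(f"reading 0: {dv_zero * 1e3:+.1f} mV")
```

With these assumed values the swing is well below the supply voltage and is smaller for a stored "1" than for a stored "0", because the "1" level is reduced by the threshold drop during the write.
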
Dynamic RAM 1-Transistor Cell: Observations

• DRAM memory cell is single‐ended
• Read operation is destructive
• Unlike the 3T cell, the 1T cell requires an extra capacitance that must be explicitly included in the design
– Polysilicon‐diffusion plate capacitor
– Trench or stacked capacitor
• When writing a “1” into a DRAM cell, a 
threshold voltage is lost
– Set WL to a higher value than Vdd

Memory Design (cont.)


• Static (SRAM)
– Data stored as long as supply voltage is applied
– Large (6 transistors/cell)
– Fast
• Dynamic (DRAM)
– Periodic refresh required
• Refreshing: read, then write back to restore charge
• Either periodically or after each read
– Small (1‐3 transistors/cell)
– Slower
– Special fabrication process
Column Circuitry
• Some circuitry is required for each column
– Bitline conditioning
– Sense amplifiers
– Column multiplexing

Bitline Conditioning
• Precharge bitlines high before reads

[Figure: precharge circuit clocked by φ on bitlines bit and bit_b.]

• Equalize bitlines to minimize voltage difference when using sense amplifiers

[Figure: equalization circuit on bitlines bit and bit_b.]

Sense Amplifiers
• Bitlines have many cells attached
– Ex: 32‐kbit SRAM has 128 rows x 256 cols
– 128 cells on each bitline
• tpd ∝ (C/I) ΔV
– Even with shared diffusion contacts, 64C of diffusion capacitance (big C)
– Discharged slowly through small transistors (small I)
• Sense amplifiers are triggered on small voltage swing (reduce ΔV)

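The relation tpd ∝ (C/I) ΔV shows why sensing a small swing helps. The sketch below is a first-order estimate that assumes a constant cell current; the function name `bitline_delay`, the unit diffusion capacitance, the cell current, and the voltage levels are all hypothetical values picked for illustration.

```python
def bitline_delay(c_bitline, i_cell, delta_v):
    """First-order delay estimate t = C * dV / I for a constant cell current."""
    return c_bitline * delta_v / i_cell


if __name__ == "__main__":
    # Hypothetical values, for illustration only
    c_unit = 1e-15            # assumed diffusion capacitance "C" per contact (F)
    c_bitline = 64 * c_unit   # 64C, as in the slide's example
    i_cell = 50e-6            # assumed cell read current (A)

    full_swing = bitline_delay(c_bitline, i_cell, 1.8)  # wait for a full 1.8 V swing
    sensed = bitline_delay(c_bitline, i_cell, 0.1)      # sense after 100 mV
    print(f"full swing: {full_swing * 1e9:.2f} ns, sensed: {sensed * 1e12:.0f} ps")
    print(f"speedup: {full_swing / sensed:.0f}x")
```

Because delay scales linearly with ΔV in this model, the speedup equals the ratio of the two voltage swings, independent of the assumed C and I.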

Differential Pair Amp


• Differential pair requires no clock
• But always dissipates static power

[Figure: differential pair sense amplifier with transistors P1, P2, N1, N2, and N3; inputs bit and bit_b; outputs sense_b and sense.]

Clocked Sense Amp
• Clocked sense amp saves power
• Requires sense_clk after enough bitline swing
• Isolation transistors cut off large bitline capacitance
[Figure: clocked sense amplifier with bit and bit_b connected through isolation transistors to a regenerative feedback stage enabled by sense_clk; outputs sense and sense_b.]


Column Multiplexing
• Recall that array may be folded for good aspect 
ratio
• Ex: 2 kword x 16 folded into 256 rows x 128 
columns
– Must select 16 output bits from the 128 columns
– Requires 16 8:1 column multiplexers
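
The folding arithmetic in this example can be checked with a short calculation. This is a sketch under stated assumptions: `fold_array` and `split_address` are hypothetical helpers, sizes are assumed to be powers of two, and using the low address bits for the column select is an assumed convention that the slides do not specify.

```python
def fold_array(words, bits_per_word, rows):
    """Fold a words x bits_per_word memory into `rows` physical rows.

    Assumes all sizes are powers of two.
    """
    total_bits = words * bits_per_word
    columns = total_bits // rows
    mux_ratio = columns // bits_per_word      # words per row -> N:1 muxes
    return {
        "columns": columns,
        "mux_ratio": mux_ratio,
        "num_muxes": bits_per_word,           # one mux per output bit
        "row_address_bits": rows.bit_length() - 1,
        "column_address_bits": mux_ratio.bit_length() - 1,
    }


def split_address(address, column_address_bits):
    """Split a word address into (row, column); low bits pick the column."""
    row = address >> column_address_bits
    column = address & ((1 << column_address_bits) - 1)
    return row, column


if __name__ == "__main__":
    layout = fold_array(words=2048, bits_per_word=16, rows=256)
    print(layout)
    print(split_address(2047, layout["column_address_bits"]))
```

For the 2 kword x 16 example this gives 128 columns and sixteen 8:1 multiplexers, with 8 row address bits and 3 column address bits making up the 11-bit word address.
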
Flash Memory
• In flash memory, stored data persists even when the memory device is not electrically powered.
• It is an improved version of electrically erasable programmable read‐only memory (EEPROM).
• The difference between flash memory and EEPROM:
– EEPROM erases and rewrites its content one byte at a time (byte level), whereas flash memory erases or writes its data in entire blocks, which makes it a very fast memory compared to EEPROM.


• Flash memory is also termed a solid‐state storage device (SSD) because it has no moving parts, in contrast to a traditional computer hard disk drive.
• The two main types of flash memory are NOR flash and NAND flash.
• Intel was the first company to introduce a commercial (NOR‐type) flash chip, in 1988, and Toshiba released the world's first NAND flash in 1989.

Core Element (FGMOSFET)

[Figure: Floating‐gate transistor: (a) elements of the transistor structure and (b) circuit symbol.]


Vth Changing

Floating-Gate Transistor: Programming

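[Figure: three floating‐gate transistor cross sections (source S, drain D), with applied voltages of 20 V, 0 V, and 5 V and floating‐gate potentials of 10 V → 5 V, ‐5 V, and ‐2.5 V respectively. Panel captions: "Avalanche injection"; "Removing programming voltage leaves charges trapped"; "Programming results in higher VT".]
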

NOR Array Architecture

Lecture 16
Testing, Packaging and ESD Protection

Testing
• Testing is one of the most expensive parts of the chip production process.
– Logic verification accounts for > 50% of design effort for many chips
– Debug time after fabrication has enormous opportunity cost
– Shipping defective parts can sink a company
• Example: Intel FDIV bug (1994)
– Logic error not caught until > 1M units shipped
– Recall cost $475M (!!!)

Logic Verification
• Does the chip simulate correctly?
– Usually done at HDL level
– Verification engineers write test bench for HDL
• Can't test all cases
• Look for corner cases
• Try to break logic design
• Ex: 32‐bit adder
– Test all combinations of corner cases as inputs:
• 0, 1, 2, 2^31‐1, ‐1, ‐2^31, a few random numbers
• Good tests require ingenuity

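The corner-case strategy for the 32-bit adder can be sketched as a small software test bench. This is an illustration only: a real test bench would drive the HDL model, whereas here `adder32` is a hypothetical Python stand-in for the design under test, and the number of random values and the seed are arbitrary assumptions.

```python
import itertools
import random

MASK = 0xFFFFFFFF


def adder32(a, b):
    """Stand-in for the design under test: 32-bit wraparound add."""
    return (a + b) & MASK


def to_unsigned(x):
    """Two's-complement encoding of a signed value in 32 bits."""
    return x & MASK


def corner_case_inputs(n_random=3, seed=0):
    """Corner cases from the slide plus a few random numbers."""
    rng = random.Random(seed)
    corners = [0, 1, 2, 2**31 - 1, -1, -2**31]
    randoms = [rng.randrange(-2**31, 2**31) for _ in range(n_random)]
    return corners + randoms


def run_tests(dut):
    """Apply all pairs of inputs; return (number of tests, list of failures)."""
    values = corner_case_inputs()
    failures = []
    for a, b in itertools.product(values, repeat=2):
        expected = to_unsigned(a + b)               # reference model
        got = dut(to_unsigned(a), to_unsigned(b))
        if got != expected:
            failures.append((a, b, got, expected))
    return len(values) ** 2, failures


if __name__ == "__main__":
    count, failures = run_tests(adder32)
    print(f"{count} tests, {len(failures)} failures")
```

Applying every pair of values exercises combinations such as (2^31‐1) + 1 and (‐1) + 1, which is where carry and overflow bugs tend to appear.
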

Silicon Debug
• Test the first chips back from fabrication
– If you are lucky, they work the first time
– If not…
• Logic bugs vs. electrical failures
– Most chip failures are logic bugs from inadequate simulation
– Some are electrical failures
• Crosstalk
• Dynamic nodes: leakage, charge sharing
• Ratio failures
– A few are tool or methodology failures (e.g. DRC)
• Fix the bugs and fabricate a corrected chip

Manufacturing Test
• A speck of dust on a wafer is sufficient to kill a chip
• Yield of any chip is < 100%
– Must test chips after manufacturing, before delivery to customers, to ship only good parts
• Manufacturing testers are very expensive
– Minimize time on tester
– Careful selection of test vectors

Manufacturing Failures

Stuck-At Faults
• How does a chip fail?
– Usually failures are shorts between two conductors or opens in a conductor
– This can cause very complicated behavior
• A simpler model: Stuck‐At
– Assume all failures cause nodes to be “stuck‐at” 0 or 1, i.e. shorted to GND or VDD
– Not quite true, but works well in practice

Examples

Observability & Controllability
• Observability: ease of observing a node by watching external output pins of the chip
• Controllability: ease of forcing a node to 0 or 1 by driving input pins of the chip
• Combinational logic is usually easy to observe and control
• Finite state machines can be very difficult, requiring many cycles to enter desired state
– Especially if state transition diagram is not known to the test engineer

Test Pattern Generation


• Manufacturing test ideally would check every node in the circuit to prove it is not stuck.
• Apply the smallest sequence of test vectors necessary to prove each node is not stuck.
• Good observability and controllability reduce the number of test vectors required for manufacturing test.
– Reduces the cost of testing
– Motivates design‐for‐test

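Stuck-at test generation can be illustrated by brute force on a tiny circuit. The sketch below assumes a hypothetical circuit y = (a AND b) OR c that does not appear in the lecture, finds which input vectors expose each stuck-at fault, and then picks a small test set with a greedy heuristic. Greedy selection is a swapped-in technique here and is not guaranteed to give the smallest possible set in general.

```python
from itertools import product

NODES = ["a", "b", "c", "n1", "y"]


def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c, optionally with one node stuck.

    fault is None or a (node, value) pair, e.g. ("n1", 0).
    """
    def stuck(name, value):
        return fault[1] if fault and fault[0] == name else value

    a, b, c = stuck("a", a), stuck("b", b), stuck("c", c)
    n1 = stuck("n1", a & b)
    return stuck("y", n1 | c)


def detecting_vectors():
    """Map each stuck-at fault to the input vectors that expose it."""
    table = {}
    for node, value in product(NODES, (0, 1)):
        table[(node, value)] = {
            v for v in product((0, 1), repeat=3)
            if simulate(*v) != simulate(*v, fault=(node, value))
        }
    return table


def greedy_test_set(table):
    """Pick vectors greedily until every detectable fault is covered."""
    uncovered = {f for f, vs in table.items() if vs}
    chosen = []
    while uncovered:
        candidates = sorted({v for f in uncovered for v in table[f]})
        best = max(candidates,
                   key=lambda v: sum(v in table[f] for f in uncovered))
        chosen.append(best)
        uncovered = {f for f in uncovered if best not in table[f]}
    return chosen


if __name__ == "__main__":
    table = detecting_vectors()
    print(greedy_test_set(table))
```

A fault is detected only when the vector both forces the node to the opposite value (controllability) and propagates the difference to the output (observability).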
Test Example
[Figure: example circuit with inputs A3, A2, A1, A0, internal nodes n1, n2, n3, and output Y.]

Node   SA1      SA0
A3     {0110}   {1110}
A2     {1010}   {1110}
A1     {0100}   {0110}
A0     {0110}   {0111}
n1     {1110}   {0110}
n2     {0110}   {0100}
n3     {0101}   {0110}
Y      {0110}   {1110}

Minimum set: {0100, 0101, 0110, 0111, 1010, 1110}

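Since the slide lists one detecting vector per fault, the listed test set is simply the union of those vectors. The sketch below transcribes the slide's SA1/SA0 entries (inputs ordered A3 A2 A1 A0) and checks that union; the names `TESTS` and `required_vectors` are hypothetical.

```python
# Detecting vectors (A3 A2 A1 A0) transcribed from the slide's SA1/SA0 table
TESTS = {
    "A3": {"SA1": "0110", "SA0": "1110"},
    "A2": {"SA1": "1010", "SA0": "1110"},
    "A1": {"SA1": "0100", "SA0": "0110"},
    "A0": {"SA1": "0110", "SA0": "0111"},
    "n1": {"SA1": "1110", "SA0": "0110"},
    "n2": {"SA1": "0110", "SA0": "0100"},
    "n3": {"SA1": "0101", "SA0": "0110"},
    "Y":  {"SA1": "0110", "SA0": "1110"},
}


def required_vectors(tests):
    """Union of all listed detecting vectors, sorted for readability."""
    return sorted({vec for row in tests.values() for vec in row.values()})


if __name__ == "__main__":
    vectors = required_vectors(TESTS)
    faults = sum(len(row) for row in TESTS.values())
    print(f"{faults} faults covered by {len(vectors)} vectors: {vectors}")
```

Many faults share a vector (0110 and 1110 appear repeatedly), so 16 faults need only 6 of the 16 possible input vectors.
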

Input / Output
• Input / Output System functions
– Communicate between chip and external world
– Drive large capacitance off chip
– Operate at compatible voltage levels
– Provide adequate bandwidth
– Limit slew rates to control di/dt noise
– Protect chip against electrostatic discharge
– Use small number of pins (low cost)

I/O Pad Design
• Pad types
– VDD / GND
– Output
– Input
– Bidirectional
– Analog

ESD Protection
• Static electricity builds up on your body
– Shock delivered to a chip can fry thin gates
– Must dissipate this energy in protection circuits before it reaches the gates
• ESD protection circuits
– Current limiting resistor
– Diode clamps
[Figure: ESD protection circuit: PAD connected through current‐limiting resistor R, with diode clamps, to the thin gate oxides.]

Packages
• Package functions and Features
– Electrical connection of signals and power from chip to board
– Little delay or distortion
– Mechanical connection of chip to board
– Removes heat produced on chip
– Protects chip from mechanical damage
– Compatible with thermal expansion
– Inexpensive to manufacture and test

Package Types
• Through‐hole vs. surface mount

• DIP (Dual In‐line Package), PLCC (Plastic Leaded Chip Carrier), PGA (Pin Grid Array), BGA (Ball Grid Array), QFP (Quad Flat Package), TSOP (Thin Small‐Outline Package)
