Advanced Digital IC Design
How Does an FPGA Work
Arnaud Taffanel
Peyman Pouyan
Outline
FPGA Basics
Virtex 5
Power Consumption in FPGAs
Low Power Approaches
2008-2-19
FPGA Basics
CMOS Design Styles
IC: standard IC, ASIC
ASIC: full custom, semicustom, programmable logic
Semicustom: standard cell; gate array, sea of gates
Programmable logic: FPGA, CPLD
Programmable Logic Main Idea
Basic idea: two-dimensional array of logic blocks and flip-flops with a means for the user to configure
The interconnection between the logic blocks
The function of each block
Different Programmable Logic Devices
Types of programmable logic
Programmable Logic Array (PLA)
Programmable Array Logic (PAL)
Complex Programmable Logic Device (CPLD)
Field Programmable Gate Array (FPGA)
Programmable Logic Array (PLA)
Two programmable planes
Any combination of ANDs / ORs
Sharing of AND terms across multiple ORs
Programmable switches between horizontal and vertical lines
[Figure: programmable AND array feeding a programmable OR array, outputs Q0 to Q3]
Programmable Array Logic (PAL)
One programmable plane: AND / fixed OR
Finite combination of ANDs / ORs
Lower switch count
Faster than PLAs
[Figure: programmable AND array feeding a fixed OR array, outputs Q0 to Q3]
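The PLA structure (a programmable AND plane feeding a programmable OR plane) can be sketched in a few lines of Python. The function names, the product-term encoding and the example terms are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def and_plane(inputs, terms):
    """Evaluate product terms. Each term maps an input index to the
    required value (1 = true literal, 0 = complemented literal)."""
    return [int(all(inputs[i] == v for i, v in term.items())) for term in terms]

def pla(inputs, terms, or_plane):
    """PLA: programmable AND plane feeding a programmable OR plane.
    or_plane[k] lists the product terms that are ORed into output k."""
    p = and_plane(inputs, terms)
    return [int(any(p[t] for t in sel)) for sel in or_plane]

# Hypothetical example with inputs (a, b, c).
terms = [{0: 1, 1: 1},   # P0 = a AND b
         {1: 0, 2: 1},   # P1 = (NOT b) AND c
         {0: 1, 2: 0}]   # P2 = a AND (NOT c)

# In a PLA the product term P0 can be shared by both outputs.
or_plane = [[0, 1],      # Q0 = P0 OR P1
            [0, 2]]      # Q1 = P0 OR P2

def truth_table():
    """Return {(a, b, c): [Q0, Q1]} for every input combination."""
    return {bits: pla(bits, terms, or_plane) for bits in product((0, 1), repeat=3)}
```

In a PAL the `or_plane` wiring would be fixed at manufacture, with each OR gate tied to its own group of AND terms, so a shared term such as P0 would have to be programmed twice.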
Complex Programmable Logic Devices (CPLD)
Logic Block contains
PAL / PLA
Registers
I/O
Interconnect includes
Full crossbar
Partial interconnect
[Figure: logic blocks and I/O around a programmable interconnect]
Why FPGAs?
Custom ICs are sometimes designed to replace large amounts of glue logic
Reduced system complexity and manufacturing cost, improved performance
Custom ICs are very expensive to develop
Custom ICs take a long time to fabricate (time to market)
Need to worry about two kinds of costs
Development cost, sometimes called non-recurring engineering (NRE)
Manufacturing cost
Why FPGAs? (Contd.)
Custom IC approach suitable only for products
With very high volume (which amortizes the NRE)
That are not time-to-market sensitive
FPGAs introduced as an alternative to custom ICs
Improved density relative to discrete SSI/MSI components
With the aid of computer-aided design (CAD) tools, circuits can be implemented in a short amount of time relative to ASICs
No physical layout process, no mask making, no IC manufacturing
Lowers NRE
Shortens time to market (TTM)
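The trade-off between NRE and manufacturing cost can be made concrete with a small break-even calculation. This is a minimal sketch: the cost model (one-time NRE plus a constant unit cost) and all numbers are hypothetical, chosen only for illustration.

```python
def total_cost(nre, unit_cost, volume):
    """Total cost of a production run: one-time NRE plus per-unit cost."""
    return nre + unit_cost * volume

def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Volume at which option A (high NRE, low unit cost) and option B
    (low NRE, high unit cost) cost the same in total."""
    return (nre_a - nre_b) / (unit_b - unit_a)

# Hypothetical numbers, not taken from any data sheet or price list.
ASIC_NRE, ASIC_UNIT = 1_000_000.0, 5.0
FPGA_NRE, FPGA_UNIT = 50_000.0, 55.0

volume = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(f"break-even volume: {volume:.0f} units")
```

Below the break-even volume the FPGA is cheaper in total; above it the custom IC wins, which is why the custom approach only pays off at very high volume.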
Why FPGAs? (Contd.)
FPGAs
Compete with custom ICs
Compete with microprocessors in dedicated and embedded applications
Summary
[Figure: ASIC, FPGA and microprocessor (MICRO) ranked against each other on performance, NREs, unit cost and TTM]
Programmable Elements Overview
[Table: comparison of programming technologies; surviving entries: Yes (in-system) / No, Yes / No, Medium / Low]
Programmable Elements Overview (Contd.)
Antifuse
Programmable Elements Overview (Contd.)
SRAM
[Figure: configuration memory cell with read/write and data lines, driving routing connections]
Field Programmable Gate Arrays
Configurable Logic Block (CLB)
Look-up table (LUT)
Register
Or any kind of logic: adder, multiplier, memory, microprocessor
Input/Output Block (IOB)
Special logic blocks at the periphery of the device for external connections
Programmable interconnect
Wires to connect inputs and outputs to logic blocks
Field Programmable Gate Arrays (Contd.)
Other FPGA building blocks
Clock distribution
Embedded memory blocks
Special purpose blocks
DSP blocks
Hardware multipliers, adders and registers
Embedded microprocessors/microcontrollers
High-speed serial transceivers
LUT based Logic Block:
A transmission gate-based LUT
[Figure: transmission gate-based LUT; label: a&b]
MUX based Logic Block:
2-Input MUX as a Programmable Logic
Configuration table (A: input selected when S = 0, B: input selected when S = 1, S: select, ' denotes complement)
A  B  S  F
0  0  0  0
0  X  1  X
0  Y  1  Y
0  Y  X  XY
X  0  Y  XY'
Y  0  X  X'Y
Y  1  X  X + Y
1  0  X  X'
1  0  Y  Y'
1  1  1  1
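The ten 2-input MUX configurations can be checked exhaustively. The sketch below assumes the mux passes input A when S = 0 and input B when S = 1, and restores the complement bars that such a mux necessarily produces (X·Y', X'·Y, X', Y'); the names `mux2`, `ROWS` and `resolve` are illustrative.

```python
from itertools import product

def mux2(a, b, s):
    """2:1 multiplexer: output follows a when s = 0 and b when s = 1."""
    return b if s else a

# Each row: the (A, B, S) wiring and the expected function F of (x, y).
# A wiring entry is a constant 0/1 or the name of a signal.
ROWS = [
    ((0, 0, 0),     lambda x, y: 0),
    ((0, "x", 1),   lambda x, y: x),
    ((0, "y", 1),   lambda x, y: y),
    ((0, "y", "x"), lambda x, y: x & y),
    (("x", 0, "y"), lambda x, y: x & (1 - y)),
    (("y", 0, "x"), lambda x, y: (1 - x) & y),
    (("y", 1, "x"), lambda x, y: x | y),
    ((1, 0, "x"),   lambda x, y: 1 - x),
    ((1, 0, "y"),   lambda x, y: 1 - y),
    ((1, 1, 1),     lambda x, y: 1),
]

def resolve(pin, x, y):
    """Turn a wiring entry into a logic value for the given x and y."""
    return {"x": x, "y": y}.get(pin, pin)

def check_all():
    """Return True if every configuration yields its expected function."""
    for (a, b, s), expected in ROWS:
        for x, y in product((0, 1), repeat=2):
            got = mux2(resolve(a, x, y), resolve(b, x, y), resolve(s, x, y))
            if got != expected(x, y):
                return False
    return True
```

Calling `check_all()` walks every row over all four (x, y) combinations, so a single 2:1 mux with constants and signals on its pins is enough to cover AND, OR, NOT and the constants.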
Programmable Interconnect
Fast local interconnect
Horizontal and vertical lines of various lengths
Switch matrices
[Figure: switch matrices and interconnect]
Switch Matrix Structure
Switch matrix programming illustration
Switch Matrix Structure (Contd.)
6 pass transistors per switch matrix interconnect point
Pass transistors act as programmable switches
Pass transistor gates are driven by configuration memory cells
FPGA Variations
Families of FPGAs differ in
Physical means of implementing user programmability
Arrangement of interconnection wires
Basic functionality of the logic blocks
Most significant difference is in the method for providing flexible blocks and connections
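The figure of 6 pass transistors per switch matrix interconnect point follows if each point joins four wire ends, one per side, with one transistor for every pair of sides: C(4, 2) = 6. The sketch below models that under this assumption; the side names and the `connected_groups` helper are illustrative.

```python
from itertools import combinations

SIDES = ("north", "east", "south", "west")

# One pass transistor per unordered pair of sides: C(4, 2) = 6.
SWITCHES = list(combinations(SIDES, 2))

def connected_groups(config):
    """config maps a side pair to the bit stored in its configuration
    memory cell. Returns the groups of wire ends that are electrically
    joined at this interconnect point."""
    groups = {s: {s} for s in SIDES}
    for a, b in SWITCHES:
        if config.get((a, b), 0):          # memory cell = 1 -> transistor on
            merged = groups[a] | groups[b]
            for s in merged:
                groups[s] = merged
    return {frozenset(g) for g in groups.values()}

# Hypothetical programming: a straight-through vertical and horizontal route.
straight = {("north", "south"): 1, ("east", "west"): 1}
```

With `straight`, the north and south wires form one net and the east and west wires another, so two signals cross the point without touching.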
Sea-Of-Module Architecture
Sea-Of-Module Architecture Routing
Channel Architecture
Actel FPGA
Rows of programmable logic building blocks and rows of interconnect
Anti-fuse Technology
8 input, single output combinational logic blocks
[Figure: Actel floorplan: logic modules and wiring tracks, surrounded on all four sides by I/O buffers, programming and test logic]
Actel Logic module
Basic module is a modified 4:1 multiplexer
[Figure: logic module built from 2:1 MUXes; data inputs D0 to D3, select inputs S0, S1, SOA, SOB]
Actel Logic module (Contd.)
Implementation of S-R Latch using Actel FPGA
[Figure: S-R latch mapped onto the logic module, with data inputs tied to "0", "0" and "1"]
Actel Interconnect
[Figure: logic modules, horizontal tracks, vertical tracks, anti-fuses]
XC4000 FPGA Architecture
SRAM cells throughout the FPGA determine the functionality of the device
XC4000E CLB
2 four-input function generators (LUTs)
1 three-input function generator
2 registers
Possible functions:
Any function of 5 variables
Two functions of 4 variables plus one function of 3 variables
Example
Implement the following functions on a single CLB of the XC4000 FPGA:
X = (A'B'(C' + D'))' = A + B + CD
Y = AK + BK + CDK + AEJL
Use look-up table F to implement X
Use look-up table G for AEJL
Use F, G and H for Y:
Y = K(A + B + CD) + AEJL
= KX + AEJL = KF + G
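The mapping onto F, G and H can be verified by brute force over all 2^8 input combinations. The sketch takes X = A + B + CD, which is what the factoring Y = KX + AEJL requires; the function names are illustrative.

```python
from itertools import product

def lut_f(a, b, c, d):
    """4-input function generator F: X = A + B + CD."""
    return a | b | (c & d)

def lut_g(a, e, j, l):
    """4-input function generator G: AEJL."""
    return a & e & j & l

def lut_h(f, g, k):
    """3-input function generator H combining F, G and K: KF + G."""
    return (k & f) | g

def y_direct(a, b, c, d, e, j, k, l):
    """Y as originally specified: AK + BK + CDK + AEJL."""
    return (a & k) | (b & k) | (c & d & k) | (a & e & j & l)

def mapping_is_correct():
    """Compare the CLB mapping against the direct expression for every input."""
    for a, b, c, d, e, j, k, l in product((0, 1), repeat=8):
        y_clb = lut_h(lut_f(a, b, c, d), lut_g(a, e, j, l), k)
        if y_clb != y_direct(a, b, c, d, e, j, k, l):
            return False
    return True
```

Y depends on eight variables, far more than a single LUT accepts, yet it fits in one CLB because it decomposes into two 4-input functions joined by a 3-input one.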
Example
Virtex 5
Virtex 5
High end of the Xilinx FPGA family
High performance (550 MHz)
Low-power design
65nm CMOS process
CLBs
2 slices per CLB
Slices
4 registers
4 LUTs
Carry logic
6-input LUT: more efficient logic (e.g. a 4:1 mux)
CLBs
Slices
Slices (contd)
SLICEL
Only ROM or LUT
SLICEM
Many configurations possible
Slices (contd)
[Figure: SLICEM LUT configurations: 64x1 single-port RAM, 32x1 dual-port RAM, 32-stage shift register, 64x1 ROM, LUT]
6-Input LUT
Optimizes common logic implementations
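A 4:1 mux is a good example of logic that a 6-input LUT absorbs whole: four data inputs plus two select inputs make exactly six. The sketch below models a LUT as a 64-bit truth table; the pin assignment (inputs 0 to 3 as data, inputs 4 and 5 as select) and the function names are assumptions for illustration, not the vendor's primitive.

```python
def lut6(init, inputs):
    """Model a 6-input LUT: 'init' is a 64-bit truth table and 'inputs'
    is a sequence of six bits (inputs[0] is the least significant
    address bit)."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> address) & 1

def mux4_init():
    """Build the 64-bit truth table of a 4:1 mux.
    Assumed pins: inputs 0-3 are data d0..d3, inputs 4-5 are the select."""
    init = 0
    for address in range(64):
        data = [(address >> i) & 1 for i in range(4)]
        select = (address >> 4) & 0b11
        init |= data[select] << address
    return init

INIT = mux4_init()
# Example lookup: d = (0, 1, 0, 0), select = 1 picks d1.
selected = lut6(INIT, [0, 1, 0, 0, 1, 0])
```

With 4-input LUTs the same mux would need two LUTs plus a further combining stage, which is the efficiency gain the slide points at.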
DSP Slices
High-performance DSP slices
Up to 250 GMACS
Optimized for single-precision floating point
40 operating modes, dynamically adaptable
RAM
2 types of RAM usable
Distributed RAM
Uses the LUTs as RAM
Close to the logic
Less RAM
Uses CLBs
Global RAM
More RAM
555MHz
Far from the logic
Routing problem
Speed problem
Power Consumption In FPGAs
Power Consumption in FPGAs
1. Static power (5% to 20%)
Leakage current: reverse-biased diode leakage current
Sub-threshold conduction of transistors
2. Dynamic power (80% to 95%)
$P_D = \frac{1}{2}\, C\, V_{dd}^{2}\, \alpha\, f_{clk}$, with $\alpha$ the switching activity
It is consumed when the output of a CMOS circuit switches
Dynamic power consumption in FPGA design
Clock frequency
Supply voltage
Switching activity
Resource utilization
[Figure: Power dissipation distribution in Xilinx Virtex-II FPGA]
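The dependence of dynamic power on supply voltage and clock frequency can be tried out numerically. The sketch assumes the usual form P_D = 1/2 · C · Vdd² · α · f_clk, where the activity factor α is an assumption here (pass `activity=1.0` to drop it); the operating point is hypothetical.

```python
def dynamic_power(c_farads, vdd_volts, f_clk_hz, activity=1.0):
    """Dynamic power P_D = 1/2 * C * Vdd^2 * activity * f_clk, in watts."""
    return 0.5 * c_farads * vdd_volts ** 2 * activity * f_clk_hz

# Hypothetical operating point, for illustration only.
C = 10e-9        # total switched capacitance: 10 nF
F_CLK = 100e6    # clock frequency: 100 MHz
ALPHA = 0.2      # assumed switching activity

p_nominal = dynamic_power(C, 1.2, F_CLK, ALPHA)
p_scaled = dynamic_power(C, 1.0, F_CLK, ALPHA)
print(f"1.2 V: {p_nominal:.3f} W, 1.0 V: {p_scaled:.3f} W")
print(f"ratio: {p_scaled / p_nominal:.3f}")
```

Because the voltage term is squared, lowering Vdd from 1.2 V to 1.0 V cuts dynamic power by roughly 30%, whereas frequency, capacitance and activity each act only linearly.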
Ways to Reduce Dynamic Power
Frequency Reduction
Voltage Scaling
Capacitance Reduction
Input capacitance of the fan-out gates
Capacitance associated with programmable interconnects
Parasitic capacitance of the gate
Switching Activity Reduction
Switched Capacitance Reduction
Resource Utilization Reduction
Low Power Approaches in FPGAs
Using Built-in Macro Functions for Low Power
Idea: alternative techniques which use fewer routing resources than the traditional techniques
1. Low power implementation of register files
2. Low power implementation of shift registers
[Figure: Xilinx SRLUT input and output ports]
3. Low power implementation of multipliers and accumulators
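A LUT-based shift register keeps every stage inside one LUT, so no routing is spent between stages, unlike a chain of separate flip-flops. The class below is a behavioural sketch only: the method names, the `ce` clock-enable argument and the 32-stage default are assumptions for illustration, not the vendor's port list.

```python
class ShiftRegisterLUT:
    """Behavioural sketch of an addressable shift register held in one LUT."""

    def __init__(self, depth=32):
        self.depth = depth
        self.bits = [0] * depth          # bits[0] is the most recent sample

    def clock(self, d, ce=1):
        """Shift in bit d on a clock edge when clock enable is high."""
        if ce:
            self.bits = [d] + self.bits[:-1]

    def q(self, addr):
        """Output tap: the bit shifted in addr + 1 clock edges ago."""
        return self.bits[addr]

# Shift in 1, 0, 0, 0 and read the tap four stages deep.
srl = ShiftRegisterLUT()
for bit in (1, 0, 0, 0):
    srl.clock(bit)
oldest = srl.q(3)
```

The address selects the effective length of the delay line, so one LUT covers any delay from 1 to `depth` cycles without rerouting.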
Virtex 5 Low Power
65nm CMOS process
Potentially more leakage current
Solved by Triple Oxide Process Technology
Gain in dynamic power
Many hard IP blocks
DSP slices
Virtex 5 Triple Oxide
65nm process -> more static power
Tens of millions of static configuration transistors
Virtex 5 Triple Oxide (contd)
3 oxide thicknesses
Thin oxide for high-speed parts
Thick oxide for 3.3 V I/O
Mid-thickness oxide (midox) for static configuration part
Virtex 5 hard IP block
RocketIO: low-power serial I/O hard IP block
SATA (Serial ATA)
Gigabit Ethernet
PCI Express
Consumes 100 mW at 3.2 Gbps
Tri-mode Ethernet MAC (10/100/1000)
References
The Design Warrior's Guide to FPGAs
Anurag Tiwari, Low Power FPGA Design Techniques for Embedded Systems (PhD thesis)
WWW.XILINX.COM
Stephen Brown and Jonathan Rose, Architecture of FPGAs and CPLDs: A Tutorial, Department of Electrical and Computer Engineering, University of Toronto
Peter Nilsson, slides of Advanced Digital IC Design