Hardware Engineer's Guide
By Shimi Cohen
FPGA Architecture
FPGA FUNDAMENTALS
Core Architecture Components
FPGAS CONTAIN FOUR ARCHITECTURAL ELEMENTS:
Configurable Logic Blocks (CLBs)
Block RAM (BRAM)
Digital Signal Processing (DSP) blocks
Input/Output Blocks (IOBs)
FPGA RESOURCE DISTRIBUTION
Resource  Share [%]  Function
CLBs      60-70      Logic
Routing   20-25      Interconnect
BRAM      8-12       Memory
DSP       3-5        Math operations
IOBs      2-3        External interfaces
Logic Element Structure
Each CLB contains multiple logic elements.
LOOKUP TABLE (LUT) CONFIGURATION:
• N-input LUT standard in modern FPGAs (6-input is common)
• Can implement any N-input Boolean function
• Can be configured as 64-bit distributed RAM
• Or as a 32-bit shift register
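As a rough illustration of why an N-input LUT can implement any N-input Boolean function, the sketch below models the LUT as a table of 2^N stored bits addressed by its inputs. The helper names make_lut and lut_read are hypothetical, not vendor primitives, and the majority function is only an example.

```python
def make_lut(func, n_inputs):
    """Precompute the 2**n_inputs configuration bits for a Boolean function."""
    bits = []
    for address in range(2 ** n_inputs):
        # Unpack the address into individual input bits (input 0 is the LSB).
        inputs = [(address >> i) & 1 for i in range(n_inputs)]
        bits.append(int(bool(func(*inputs))))
    return bits


def lut_read(bits, *inputs):
    """Evaluate the LUT: the inputs form an address into the stored bits."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return bits[address]


# Hypothetical example: a 3-input majority function stored in 8 bits.
majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)
```

A 6-input LUT modeled this way holds 64 bits, which is consistent with the 64-bit distributed RAM figure in the list above.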
FLIP-FLOP RESOURCES:
• Independent clock enable and reset
• Set/reset can be synchronous or asynchronous
• Clock inversion capability
CARRY CHAIN LOGIC:
• Dedicated carry propagation between LEs
• Enables efficient arithmetic operations
• 1-bit full adder per LE
• Reduced propagation delay vs. general routing
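The carry chain can be pictured as a ripple of 1-bit full adders, one per logic element. The following behavioral sketch shows that structure; it models the arithmetic only, not the timing advantage of the dedicated carry path, and the function names are illustrative.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder, the per-LE building block of the carry chain."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_add(x, y, width):
    """Add two unsigned integers bit by bit, passing the carry between stages."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry  # carry is the final carry-out of the chain
```

For example, ripple_add(15, 1, 4) wraps the 4-bit sum to 0 and reports a carry-out of 1.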
Interconnect Architecture
FPGA routing architecture determines performance and utilization efficiency.
ROUTING HIERARCHY LEVELS
Level         Length       Delay      Usage
Local         1-2 CLBs     0.1-0.2ns  CLB internal
Intermediate  4-8 CLBs     0.5-1.0ns  Regional routing
Global        Full device  2-4ns      Clock, reset
Long Lines    Device span  1-2ns      High fanout
SWITCH MATRIX ARCHITECTURE:
Programmable interconnect points (PIPs)
SRAM-controlled pass transistors
Routing channel capacity: 200-400 wires per channel
Connection flexibility: 20-40% of crosspoints
POWER SYSTEM DESIGN
Power Domain Requirements
FPGA power systems require multiple voltage domains with specific characteristics.
CORE VOLTAGE DOMAINS
Domain   Voltage     Tolerance  Current  Function
VCCCORE  0.85V-1.2V  ±3%        5-50A    Core logic
VCCAUX   1.8V        ±5%        0.5-3A   Auxiliary
VCCIO    1.2V-3.3V   ±5%        0.1-5A   I/O banks
VCCBRAM  0.85V-1.2V  ±3%        0.5-5A   Block RAM
POWER CONSUMPTION CHARACTERISTICS:
Static power: 20-40% of total (leakage current)
Dynamic power: 60-80% of total (switching activity)
Temperature coefficient: +2%/°C for static power
Power Sequencing Implementation
Proper power sequencing prevents latch-up and ensures reliable operation.
RECOMMENDED SEQUENCE:
1. VCCAUX - Auxiliary supply, powered first
2. VCCINT - Core voltage (VCCCORE)
3. VCCBRAM - Concurrent with VCCINT
4. VCCO - I/O voltages (VCCIO), powered last
TIMING REQUIREMENTS:
• Ramp rate: 0.2 V/ms to 10 V/ms
• Delay between domains: 1-100 ms
• Rails at the same voltage: ramp simultaneously
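To make the slew-rate limits concrete, the sketch below converts them into an allowed ramp-time window for a given rail. It assumes the 0.2 to 10 V/ms figure is the supply ramp rate and that the ramp is linear; the constant and function names are illustrative.

```python
MIN_SLEW_V_PER_MS = 0.2   # slowest ramp in the timing list above
MAX_SLEW_V_PER_MS = 10.0  # fastest ramp in the timing list above


def ramp_time_window_ms(rail_voltage):
    """Return (fastest, slowest) allowed ramp time in ms for a rail."""
    fastest = rail_voltage / MAX_SLEW_V_PER_MS
    slowest = rail_voltage / MIN_SLEW_V_PER_MS
    return fastest, slowest


def ramp_is_valid(rail_voltage, ramp_time_ms):
    """Check a measured linear ramp time against the slew-rate window."""
    fastest, slowest = ramp_time_window_ms(rail_voltage)
    return fastest <= ramp_time_ms <= slowest
```

Under these assumptions a 1.0 V rail should take between roughly 0.1 ms and 5 ms to ramp, so a 10 ms soft-start would be too slow.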
Power Sequencing Circuit Example:
USING DEDICATED IC:
• TI TPS65381 or ADI ADP5138
• Programmable timing delays
• Fault detection and shutdown
• Enable daisy-chaining
DISCRETE IMPLEMENTATION:
• Voltage supervisors
• RC delays for timing
• MOSFET switches (control)
Decoupling Strategy
Effective decoupling prevents power supply noise and voltage droop.
DECOUPLING CAPACITOR SELECTION
Capacitor Type  Value Range  ESR      Frequency Range
Bulk            100-1000µF   10-50mΩ  DC-100kHz
Ceramic         0.1-47µF     1-5mΩ    100kHz-100MHz
Small Ceramic   1-100nF      <1mΩ     10MHz-1GHz
CLOCK ARCHITECTURE
Clock Distribution Networks
FPGA clock networks provide low-skew, low-jitter distribution to sequential elements.
CLOCK NETWORK TYPES
Network Type  Skew    Jitter  Fanout   Usage
Global        <50ps   <20ps   50,000+  System clocks
Regional      <100ps  <30ps   10,000   Local clocks
I/O           <200ps  <50ps   500      Interface clocks
CLOCK BUFFER ARCHITECTURE:
H-tree distribution topology
Multiple buffer stages for fanout
Deskew circuits at distribution points
Dedicated routing resources
Clock Domain Crossing
Managing signals crossing between different clock domains requires specific techniques.
SYNCHRONIZER CIRCUITS:
Two-flip-flop synchronizer for single bits
Gray code counters for multi-bit values
Handshake protocols for control signals
FIFO buffers for data streams
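Gray code is used for multi-bit crossings because only one bit changes between consecutive counts, so a synchronizer that samples mid-transition sees either the old or the new value, never a mix. A minimal sketch of the standard binary/Gray conversions (function names are illustrative):

```python
def binary_to_gray(value):
    """Convert an unsigned binary count to its reflected Gray code."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Recover the binary count by XOR-folding the Gray code downward."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a, b):
    """Number of bit positions that differ between two codes."""
    return bin(a ^ b).count("1")
```

In an asynchronous FIFO, the read and write pointers would typically be converted with binary_to_gray before crossing into the other clock domain.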
CRITICAL TIMING CONSIDERATIONS:
Setup/hold violations at domain boundaries
Metastability resolution time
Clock skew between domains
False path timing constraints
PLL Configuration
Phase-Locked Loops generate and condition clock signals within FPGAs.
PLL SPECIFICATIONS:
Input frequency range: 10MHz-800MHz
Output frequency range: 6.25MHz-1.6GHz
Phase resolution: 45-90° steps
Jitter performance: 80-150ps RMS
PLL CONFIGURATION PARAMETERS
Parameter     Range   Resolution  Function
M (Feedback)  2-128   1           Multiplication factor
D (Input)     1-128   1           Input division
N (Output)    1-128   1           Output division
Phase         0-315°  45°         Output phase shift
VCO OPERATING RANGE:
Must operate within 600MHz-1.6GHz
Formula: VCO = (Input × M) / D
Output = VCO / N
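The two formulas and the VCO range can be combined into a quick settings check. The sketch below uses the limits quoted in this section (M in 2-128, D and N in 1-128, VCO in 600 MHz-1.6 GHz); the function name is hypothetical, and real devices impose further limits that are not modeled here.

```python
VCO_MIN_MHZ = 600.0
VCO_MAX_MHZ = 1600.0


def pll_frequencies(f_in_mhz, m, d, n):
    """Return (vco_mhz, out_mhz), or raise ValueError if a limit is violated."""
    if not 2 <= m <= 128:
        raise ValueError("M must be in 2-128")
    if not 1 <= d <= 128:
        raise ValueError("D must be in 1-128")
    if not 1 <= n <= 128:
        raise ValueError("N must be in 1-128")
    vco = f_in_mhz * m / d  # VCO = (Input x M) / D
    if not VCO_MIN_MHZ <= vco <= VCO_MAX_MHZ:
        raise ValueError(f"VCO at {vco} MHz is outside the allowed range")
    return vco, vco / n     # Output = VCO / N
```

For a 100 MHz input with M=10, D=1, N=8, the VCO runs at 1000 MHz and the output is 125 MHz. With M=4 the VCO would sit at 400 MHz, below the 600 MHz minimum, and the check raises an error.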
INPUT/OUTPUT BANKS
I/O Standards and Termination
FPGA I/O banks support multiple electrical standards with specific requirements.
COMMON I/O STANDARDS
Standard  Voltage       Termination  Speed    Application
LVCMOS    1.2V-3.3V     None         <200MHz  General purpose
LVDS      Differential  100Ω         1.5Gbps  High-speed serial
DDR3      1.5V          60Ω/120Ω     800MHz   Memory interface
PCIe      Differential  100Ω         8Gbps    Serial protocol
TERMINATION REQUIREMENTS:
Series termination at source
Parallel termination at receiver
AC termination for clocks
Differential termination for pairs
High-Speed Interface Design
Interfaces operating above 100MHz require specialized design techniques.
DDR MEMORY INTERFACE:
Differential clocking required
Read calibration for setup/hold
ODT (On-Die Termination) control
SERDES INTERFACE DESIGN:
AC coupling for DC balance
Common-mode termination
Spread spectrum clocking
FPGA CONFIGURATION
Configuration Memory Types
FPGAs use various memory technologies for configuration storage.
CONFIGURATION MEMORY COMPARISON
Memory Type  Capacity  Speed   Volatility    Cost
SRAM         High      Fast    Volatile      High
Flash        Medium    Medium  Non-volatile  Medium
Antifuse     Low       N/A     Non-volatile  Low
SRAM-BASED CONFIGURATION:
Configuration lost at power-down
Requires external memory for boot
Fast reconfiguration capability
Partial reconfiguration support
Multi-Boot Implementation
Multi-boot enables failsafe operation and field updates.
GOLDEN BOOT IMAGE:
Factory-programmed known-good configuration
Fallback option if application image fails
Watchdog timer triggers fallback
Remote update capability
BOOT IMAGE MANAGEMENT:
Image validation using CRC
Image selection via external pins
Automatic fallback on failure
Boot status reporting
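The validate-then-fall-back flow can be sketched in a few lines. This is a host-side illustration only: it assumes a CRC-32 stored alongside the application image, whereas actual devices perform the check in their configuration logic using a vendor-defined CRC. The function names are hypothetical.

```python
import zlib


def image_crc(image: bytes) -> int:
    """CRC-32 of an image, as an unsigned 32-bit value."""
    return zlib.crc32(image) & 0xFFFFFFFF


def select_boot_image(application: bytes, expected_crc: int, golden: bytes) -> bytes:
    """Return the application image if its CRC matches, else the golden image."""
    if image_crc(application) == expected_crc:
        return application
    return golden  # automatic fallback on validation failure
```

A corrupted application image fails the comparison, so the loader falls back to the golden image.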
Boot Configuration Schemes
Multiple boot modes provide flexibility for different applications.
MASTER SERIAL MODE:
FPGA controls configuration sequence
External serial memory (SPI/QSPI)
Simple interface implementation
Lowest pin count requirement
SLAVE SERIAL MODE:
External processor controls configuration
Synchronous serial interface
Processor can verify configuration
Enables custom boot sequences
THERMAL AND MECHANICAL
Thermal Analysis Requirements
FPGA thermal management ensures reliable operation and performance.
THERMAL SPECIFICATIONS:
Junction temperature: 0°C to 100°C typical
Ambient temperature: -40°C to 85°C
Thermal shutdown: 125°C
Performance derating above 85°C
Power Dissipation Calculation
STATIC POWER:
Leakage current dependent on temperature
Increases exponentially with temperature
Device-specific leakage specifications
Process variation effects
DYNAMIC POWER:
P = C × V² × f × α
C = total capacitance switched
V = supply voltage
f = switching frequency
α = activity factor (0-1)
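A small calculator makes the scaling behavior of the dynamic power formula easy to explore. The input values used in the note below are placeholders chosen for round numbers, not measurements from any device.

```python
def dynamic_power_w(capacitance_f, voltage_v, frequency_hz, activity):
    """Dynamic power P = C x V^2 x f x alpha, in watts."""
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity factor must be between 0 and 1")
    return capacitance_f * voltage_v ** 2 * frequency_hz * activity


def voltage_scaling_ratio(v_new, v_old):
    """Relative dynamic power after a supply change, all else held equal."""
    return (v_new / v_old) ** 2
```

With the placeholder values C = 10 nF, V = 1.0 V, f = 200 MHz and α = 0.125, the result is 0.25 W. Because power scales with V², lowering the core supply from 1.0 V to 0.85 V cuts dynamic power to about 72% of its original value.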
Package Selection Criteria
Package selection affects thermal, electrical, and mechanical performance.
PACKAGE COMPARISON
Package    Thermal Resistance  Pin Count  Cost    Applications
TQFP       25-40°C/W           144-324    Low     Cost-sensitive
BGA        5-15°C/W            256-1760   Medium  High-performance
Flip-Chip  2-8°C/W             676-1760   High    Maximum performance
THERMAL RESISTANCE COMPONENTS:
Junction-to-case (θJC)
Case-to-ambient (θCA)
Junction-to-ambient (θJA) = θJC + θCA
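The thermal resistances feed the usual steady-state estimate Tj = Ta + P × θJA. The sketch below applies it with θJA = θJC + θCA as given above; the function names are illustrative and the numbers in the note are placeholders, not datasheet values.

```python
def junction_temp_c(ambient_c, power_w, theta_jc, theta_ca):
    """Steady-state junction temperature: Tj = Ta + P x (theta_JC + theta_CA)."""
    theta_ja = theta_jc + theta_ca
    return ambient_c + power_w * theta_ja


def max_power_w(tj_max_c, ambient_c, theta_jc, theta_ca):
    """Largest dissipation that keeps the junction at or below tj_max_c."""
    return (tj_max_c - ambient_c) / (theta_jc + theta_ca)
```

With placeholder values of 25°C ambient, 5 W dissipation, θJC = 1°C/W and θCA = 9°C/W, the junction settles at 75°C, and the same package could dissipate up to 7.5 W before reaching a 100°C limit.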
DEBUG AND TESTING
JTAG Chain Implementation
JTAG provides boundary scan and programming capabilities.
JTAG SIGNAL REQUIREMENTS:
TCK: Test clock input
TMS: Test mode select
TDI: Test data input
TDO: Test data output
JTAG CHAIN DESIGN RULES
Parameter           Requirement  Notes
Clock Frequency     1-50MHz      Device dependent
Pull-up Resistors   1-10kΩ       TMS, TDI required
Series Termination  22-33Ω       High-speed operation
Chain Length        <8 devices   Signal integrity
On-Chip Debug Resources
Modern FPGAs include embedded debug capabilities.
INTEGRATED LOGIC ANALYZER (ILA):
Real-time signal capture
Programmable trigger conditions
Deep memory buffers
Multiple probe points
CHIPSCOPE/SIGNALTAP FEATURES:
Up to 1024 signals monitored
Trigger patterns and sequences
Data export capabilities
Post-processing analysis
Test Point Strategy
Strategic test points enable effective debug and verification.
CRITICAL SIGNAL MONITORING:
Power supply voltages
Clock signals and frequencies
Reset and enable signals
High-speed differential pairs
TEST POINT PLACEMENT:
Accessible with standard probes
Minimal trace length addition
Ground reference nearby
Protection from accidental shorts
FPGA VS CPLD COMPARISON
Architecture Differences
Complex Programmable Logic Devices (CPLDs) and FPGAs serve different roles in digital design,
each with distinct architectural characteristics.
CPLD ARCHITECTURE FUNDAMENTALS:
Product-term based logic - Uses sum-of-products structure rather than LUTs
Macrocell organization - Fixed logic blocks with dedicated flip-flops
Global interconnect - Centralized routing matrix for all connections
Non-volatile configuration - Flash/EEPROM based, no external boot memory needed
FPGA ARCHITECTURE FUNDAMENTALS:
LUT-based logic - Look-up tables provide flexible Boolean function implementation
Hierarchical routing - Multiple levels of interconnect for scalability
Volatile configuration - SRAM-based, requires external boot sequence
Rich memory resources - Embedded block RAM and distributed memory
PERFORMANCE AND CAPACITY COMPARISON
Parameter           CPLD                FPGA
Logic Capacity      32-512 macrocells   1K-1M+ logic cells
Propagation Delay   5-15ns predictable  1-10ns varies with routing
Power Consumption   10-500mW            100mW-100W+
Configuration Time  Instant-on          1-500ms boot time
Cost (relative)     $1-50               $10-10,000+
Application Guidelines
CHOOSE CPLD WHEN:
Glue logic replacement - Simple combinational functions
State machines - Well-defined sequential control
Instant-on requirement - No boot delay acceptable
Low power critical - Battery-powered applications
Predictable timing - Hard real-time requirements
Small form factor - Limited board space
CHOOSE FPGA WHEN:
Complex algorithms - DSP, image processing, cryptography
High-speed interfaces - multi-gigabit transceivers
Large memory requirements - Embedded processors, data buffering
Reconfigurable systems - Field-updatable functionality
Parallel processing - Multiple concurrent operations
Migration Considerations
When migrating between CPLD and FPGA architectures:
CPLD TO FPGA MIGRATION:
Timing closure - May require constraint adjustments
Power increase - Additional cooling/power supply considerations
Boot sequence - Add configuration memory and sequencing
Cost increase - Higher device and support component costs
FPGA TO CPLD MIGRATION:
Resource limitations - May require architecture simplification
Timing predictability - Better worst-case timing guarantees
Power reduction - Significant power savings possible
Instant-on benefit - Eliminates boot delay
REAL-WORLD EXAMPLE
High-Speed Data Acquisition System
SYSTEM REQUIREMENTS:
16-channel analog input
125 MSPS sampling rate per channel
14-bit resolution
10 Gbps Ethernet output
ARCHITECTURE OVERVIEW:
ADC interface using LVDS
DDR3 memory for buffering
Ethernet MAC/PHY interface
DSP processing blocks
PERFORMANCE REQUIREMENTS:
Parameter           Specification  Design Notes
Sample Rate         125 MSPS × 16  2.0 Gbps per channel, 32 Gbps aggregate (16-bit aligned)
Data Width          14 bits        16-bit alignment
Memory Bandwidth    4 GB/s         DDR3-1600 capable
Network Throughput  10 Gbps        Full-duplex Ethernet
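The table's figures can be cross-checked with simple arithmetic. The sketch below assumes each 14-bit sample is padded to 16 bits before buffering, which is how the 16-bit alignment entry is read here; the function name is illustrative.

```python
def adc_throughput_gbps(channels, msps, bits_per_sample):
    """Aggregate ADC data rate in Gbps."""
    return channels * msps * 1e6 * bits_per_sample / 1e9


per_channel = adc_throughput_gbps(1, 125, 16)   # one channel, 16-bit aligned
aggregate = adc_throughput_gbps(16, 125, 16)    # all 16 channels
aggregate_gbytes = aggregate / 8                # convert Gbps to GB/s
raw_14bit = adc_throughput_gbps(16, 125, 14)    # without padding
```

Under that assumption one channel produces 2.0 Gbps and all sixteen produce 32 Gbps, or 4 GB/s, which matches the memory bandwidth row. Even the unpadded 28 Gbps stream exceeds the 10 Gbps Ethernet link, so the design presumably reduces the data (for example in the DSP blocks) before transmission; that reading is an inference, not something the requirements state.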
RESOURCE UTILIZATION ESTIMATE:
Logic elements: 45,000 (35% utilization)
Memory blocks: 180 (60% utilization)
DSP blocks: 240 (80% utilization)
I/O pins: 400 (70% utilization)
Implementation Strategy
CLOCK ARCHITECTURE:
125 MHz ADC sampling clock
200 MHz DDR3 memory clock
156.25 MHz Ethernet reference
POWER SYSTEM DESIGN:
12V input with multiple regulators
1.0V core, 2.5V I/O, 1.5V memory
Sequencing controller with monitoring
Total power budget: 25W
THERMAL MANAGEMENT:
Forced air cooling with 10 CFM
Heat sink with 2°C/W thermal resistance
Junction temperature <80°C at full load
Thermal monitoring and throttling
SIGNAL INTEGRITY CONSIDERATIONS:
Differential ADC interfaces
Length-matched memory traces
Controlled impedance throughout
Power and ground plane integrity
LAYOUT CONSIDERATIONS
Layer Stack-up Planning
FPGA PCB design requires careful layer planning to manage high-speed signals, power
distribution, and thermal considerations.
RECOMMENDED LAYER STACK-UPS:
8-Layer FPGA Board:
Layer 1: Component placement
Layer 2: Ground plane
Layer 3: High-speed signal routing
Layer 4: Power plane (core voltages)
Layer 5: Power plane (I/O voltages)
Layer 6: High-speed signal routing
Layer 7: Ground plane
Layer 8: Component placement
Layer Thickness Specifications:
Core thickness: 0.1-0.2mm between signal and reference planes
Prepreg thickness: 0.1-0.3mm for controlled impedance
Copper weights: 1-2oz for signal layers, 2-4oz for power planes
Via sizes: 0.1-0.2mm for high-density routing
High-Speed Routing Guidelines
DIFFERENTIAL PAIR ROUTING:
Impedance matching: ±10% tolerance for differential pairs
Length matching: <0.1mm mismatch within pairs
Spacing control: 3W rule (3× trace width separation)
Via optimization: Minimize via count in critical paths
MEMORY INTERFACE ROUTING:
Address/Command: Fly-by topology for DDR3 (T-topology acceptable for lightly loaded buses), length-matched groups
Data signals: Point-to-point, length-matched within each byte lane to its strobe
Clock signals: Differential routing with guard traces
Power delivery: Dedicated planes with low impedance
COMPONENT SELECTION
Supporting Component Selection
SPI FLASH MEMORY COMPARISON:
Parameter        Standard SPI  Quad SPI    Octal SPI
Interface Width  1-bit         4-bit       8-bit
Read Speed       50-100MHz     100-200MHz  200-400MHz
Boot Time        Longest       Medium      Fastest
Cost             Lowest        Medium      Highest
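The boot-time ranking follows from interface width and clock rate. A rough estimate is bitstream size divided by bits transferred per second, as sketched below. The 32 Mbit bitstream size is a hypothetical figure, and the estimate ignores command overhead, dummy cycles, and device start-up time.

```python
def config_time_ms(bitstream_bits, bus_width_bits, clock_mhz):
    """Approximate configuration load time in ms (single data rate)."""
    bits_per_second = bus_width_bits * clock_mhz * 1e6
    return bitstream_bits / bits_per_second * 1e3


BITSTREAM_BITS = 32e6  # hypothetical 32 Mbit bitstream

standard_ms = config_time_ms(BITSTREAM_BITS, 1, 50)   # 1-bit at 50 MHz
quad_ms = config_time_ms(BITSTREAM_BITS, 4, 100)      # 4-bit at 100 MHz
octal_ms = config_time_ms(BITSTREAM_BITS, 8, 200)     # 8-bit at 200 MHz
```

For this hypothetical bitstream the estimates come out near 640 ms, 80 ms and 20 ms respectively, which is why wider interfaces matter when a design has a tight power-up deadline.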
POWER MANAGEMENT IC SELECTION:
Integrated solutions: TI TPS65381, ADI ADP5138
Discrete solutions: Individual LDOs and switching regulators
Power efficiency: >85% efficiency target for switching regulators
Sequencing accuracy: <50mV overshoot during transitions
FPGA COMPARISON
FPGA Comparison Matrix