Timing Closure
Lecture 7
Jignesh Shah
UCSC Extension, Silicon Valley
Spring 2024
Agenda
• Logistics
• Quick review of lecture 6
• Clock Domain Crossing
• Signal Integrity (i.e. Crosstalk)
Logistics
• Grades for all labs and the midterm are posted online.
• Midterm answers can be viewed online too.
• Any questions?
Review of lecture 6
• Process Voltage Temperature (aka PVT)
• Global Process Corners SS, TT, FF
• Voltage : Vlow Vnom Vhigh
• Temperature : Hot, Cold
• Interconnect Corners
• RCworst, Cworst, Typical, Cbest, RCbest
• STA Modes
• Functional, ScanShift, ScanCapture, BIST, Low power
• Timing margins
• Jitter, Aging, IR (aka delta_V), Temperature difference (i.e. delta_T), Unknown
Single Clock Domain Design
• Every functional block is clocked by the same clock signal, hence the name
single clock domain.
Single Clock CPU
• Easier for data transfer across functional blocks.
• Under-designed: the ALU may have to run slower than it could because the IO interfaces share the same clock.
Multiple Clock Domain Design
• Different functional blocks can run at different frequencies using clock
signals from different clock sources.
Multi Clock CPU
• General Processor at 1 GHz
• Memory Controller at 500 MHz
• Video Processor at 333.33 MHz
Many IPs in a typical design
Challenge With Multiple Clocks
• Reliable data transfer between two clock domains.
- Clocks with different frequencies, usually from different sources
- Clocks with the same frequency but different phase
- Clocks whose frequencies are integer multiples of each other
(i.e. phase aligned but the frequencies are different)
Things to consider for Multiple clock domain
• Clock Domain Partitioning: The design needs to be divided into separate clock
domains based on functional and frequency requirements.
• Clock Domain Crossing Techniques: Various techniques are employed to handle
clock domain crossing, such as using synchronizers (e.g., two-stage or multi-stage
synchronizers), FIFOs (First-In-First-Out buffers), handshaking protocols, or
handshake-based synchronizers.
• Timing Constraints: Proper timing constraints need to be defined for each clock
domain and the paths between them. These constraints include setup and hold
times, clock group definition and maximum delay limits. Ensuring that the data
transfer meets these constraints is crucial for reliable operation.
Metastability in Clock Domain Crossing
• A setup/hold violation between a flop in the src_clk domain and a flop in the dest_clk
domain running at a different frequency can cause a functional failure.
Synchronizers to avoid Metastability
• The most common way to prevent metastability is to add one or more
synchronizing flops at the clock crossing boundary.
• The higher the number of synchronizer stages, the better the Mean Time Between
Failures (aka MTBF) and hence the reliability, but latency also increases.
• For low frequencies, use a 2-stage synchronizer.
• Follow strict placement guidelines between the flops of a synchronizer for better MTBF.
• Avoid combinational logic between the flops of a synchronizer.
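The trade-off between synchronizer stages and MTBF can be sketched numerically. The block below uses the common textbook exponential model, MTBF = exp(t_resolve / tau) / (T_w * f_clk * f_data), and assumes each extra stage adds roughly one clock period (minus flop overhead) of resolution time. The function name and all parameter values (tau, T_w, overhead) are hypothetical, not taken from any library or process.

```python
import math

def synchronizer_mtbf(n_stages, f_clk_hz, f_data_hz, tau_s, t_w_s, t_overhead_s=0.0):
    """MTBF in seconds for an n-stage synchronizer (simplified textbook model).

    tau_s        : metastability resolution time constant of the flop (assumed)
    t_w_s        : metastability capture window of the flop (assumed)
    t_overhead_s : setup + clock-to-Q time lost from each resolving period (assumed)
    """
    if n_stages < 2:
        raise ValueError("a synchronizer needs at least 2 flops")
    t_clk = 1.0 / f_clk_hz
    # Each stage after the first gives the metastable node about one
    # clock period, minus flop overhead, to resolve.
    t_resolve = (n_stages - 1) * (t_clk - t_overhead_s)
    return math.exp(t_resolve / tau_s) / (t_w_s * f_clk_hz * f_data_hz)

# Hypothetical numbers: 1 GHz destination clock, 100 MHz data toggle rate.
for stages in (2, 3):
    mtbf = synchronizer_mtbf(stages, 1e9, 100e6, tau_s=20e-12,
                             t_w_s=30e-12, t_overhead_s=100e-12)
    print(f"{stages} stages: MTBF ~ {mtbf:.3e} s, latency = {stages} cycles")
```

Under this model each added stage multiplies MTBF by exp(t_resolve_per_stage / tau) while adding one destination clock cycle of latency.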
Two flops synchronizer
• The output of the first flop of the synchronizer could go metastable, but it is fed directly to
another flop driven by the same clock.
• The output of the second flop is used for normal operation.
Data loss prevention for CDC
• A synchronizer gives reliable data at the clock crossing interface, but data can be lost
when the launch flop generates multiple data values before the capturing flop stores one.
• An STA tool cannot identify data loss. A CDC simulation tool needs to be run.
• Two popular methods to avoid data loss in CDC:
* Handshake
* FIFO Memory
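The data-loss case can be illustrated with an idealized model: the source register is rewritten every source period, and the destination samples it once per destination period with no handshake or FIFO. The sketch ignores metastability and synchronizer latency, and the function names and periods are hypothetical.

```python
def captured_values(src_period, dst_period, n_values):
    """Distinct values seen by the destination, in order of capture.

    The source writes value k at time k * src_period (integer ps).
    The destination samples at every multiple of dst_period.
    """
    end = n_values * src_period
    seen = []
    t = 0
    while t < end:
        value = t // src_period  # most recent value written at or before t
        if not seen or seen[-1] != value:
            seen.append(value)
        t += dst_period
    return seen

def lost_values(src_period, dst_period, n_values):
    """Values the source produced that the destination never captured."""
    seen = set(captured_values(src_period, dst_period, n_values))
    return [v for v in range(n_values) if v not in seen]

# Fast source (1000 ps) into slow destination (3000 ps): values are dropped.
print(lost_values(1000, 3000, 9))
# Slow source (3000 ps) into fast destination (1000 ps): nothing is dropped.
print(lost_values(3000, 1000, 3))
```

A handshake or FIFO removes the loss by preventing the source from overwriting a value before the destination has taken it.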
Timing Constraint on clock of different Freq
• Clocks whose frequencies are not integer multiples of each other are usually
asynchronous to each other.
• Constrain those clocks with
set_false_path -from [get_clocks <xclk>] -to [get_clocks <yclk>]
set_false_path -from [get_clocks <yclk>] -to [get_clocks <xclk>]
OR
set_clock_groups -asynchronous -group <xclk> -group <yclk> -allow_paths
set_max_delay <period of capture clock> -from [get_clocks <xclk>] -to [get_clocks <yclk>]
set_max_delay <period of capture clock> -from [get_clocks <yclk>] -to [get_clocks <xclk>]
Timing Constraint on Slow to Fast clock
• Clocks whose frequencies are integer multiples of each other are usually synchronous with each other.
• Constrain such clocks with a multicycle path.
Glitch protection for MCP
• One possible circuit to enable a multicycle of 4
• Multicycle path of 2 using a synchronizer
Timing Constraint on Fast to Slow clock
• Clocks whose frequencies are integer multiples of each other are usually synchronous with each other.
• Constrain such clocks with a multicycle path to get a less restrictive setup check.
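The effect of a multicycle path on the setup check for integer-multiple clocks, in either direction, can be worked out with edge arithmetic. The sketch below assumes phase-aligned clocks with rising edges at integer multiples of the period (in ps), and models only the setup relation. A matching hold adjustment is normally needed as well and is not modeled. The function names are hypothetical; "end" and "start" mirror the -end and -start options of set_multicycle_path.

```python
from math import gcd

def default_setup_relation(launch_period, capture_period):
    """Tightest launch-to-capture edge separation with no multicycle applied."""
    hyper = launch_period * capture_period // gcd(launch_period, capture_period)
    best = None
    for launch_edge in range(0, hyper, launch_period):
        # First capture edge strictly after this launch edge.
        capture_edge = (launch_edge // capture_period + 1) * capture_period
        separation = capture_edge - launch_edge
        if best is None or separation < best:
            best = separation
    return best

def multicycle_setup_relation(launch_period, capture_period, n, relative_to):
    """Setup relation after a setup multicycle of n.

    relative_to="end"  : capture edge moves later by (n-1) capture periods
    relative_to="start": launch edge moves earlier by (n-1) launch periods
    """
    if relative_to not in ("end", "start"):
        raise ValueError("relative_to must be 'end' or 'start'")
    base = default_setup_relation(launch_period, capture_period)
    step = capture_period if relative_to == "end" else launch_period
    return base + (n - 1) * step

# Slow (4000 ps) to fast (1000 ps): default check is one fast period.
print(default_setup_relation(4000, 1000), multicycle_setup_relation(4000, 1000, 4, "end"))
# Fast (1000 ps) to slow (4000 ps): default check is also one fast period.
print(default_setup_relation(1000, 4000), multicycle_setup_relation(1000, 4000, 4, "start"))
```

In both directions the default check allows only one fast-clock period, and a multicycle of 4 relaxes it to the full slow-clock period.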
Timing Constraint on Phase Shift clocks
• Clocks with the same frequency but different phase can be synchronous.
Half Cycle setup
Half Cycle hold
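The half-cycle setup and hold relations follow from simple edge arithmetic. The sketch below assumes two clocks of the same period with the capture clock delayed by a phase given in degrees, a launch edge at time 0, and integer picoseconds. The function name is hypothetical.

```python
def phase_shift_relations(period, capture_phase_deg):
    """Return (setup_relation, hold_relation) in the units of `period`.

    setup_relation: launch edge to the first capture edge after it.
    hold_relation : launch edge to the capture edge one period earlier.
    """
    shift = period * (capture_phase_deg % 360) // 360
    setup_relation = shift if shift > 0 else period
    hold_relation = setup_relation - period
    return setup_relation, hold_relation

# 1000 ps clocks, capture clock shifted by 180 degrees (inverted clock).
print(phase_shift_relations(1000, 180))
# No phase shift: the usual full-cycle setup and zero-cycle hold.
print(phase_shift_relations(1000, 0))
```

With a 180 degree shift the setup check gets only half a period, while the hold check gains half a period of margin.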
Different clock sources
• Clocks coming from different PLL sources are asynchronous even though
they have the same frequency.
Ref_clock CLKA, 1.0 GHz
CLKB, 1.0 GHz
set_clock_groups -asynchronous -group {CLKA} -group {CLKB}
• Understand the difference among
set_clock_groups -asynchronous …
set_clock_groups -logical_exclusive …
set_clock_groups -physical_exclusive …
Signal Integrity
• For STA, signal integrity is a measure of the quality of a digital signal, whose
binary value is represented by a voltage waveform.
• Crosstalk, the unwanted electrical interaction among multiple wires, can
impair the integrity of a signal.
Net Capacitance With Different Feature Sizes
• As circuit geometries become smaller, wire interconnections become closer
together and taller, thus increasing the cross-coupling capacitance between nets.
• At the same time, parasitic capacitance to the substrate becomes less as
interconnections become narrower, and cell delays are reduced as transistors
become smaller.
[Figure: wire cross-sections at 0.25 micron and 0.13 micron, showing insulator and substrate (ground)]
Crosstalk effect
• Delay
• Coupling of switching activity of the victim with the switching activity of the
aggressors impacts cell delay.
• Glitch
• Disturbance (aka noise) caused on a steady victim signal
Cross talk glitch
• A steady signal net can have a glitch due to charge transferred by the
switching aggressors through the coupling capacitances
Fig 1: Glitch in digital circuit
Fig 2: Type of Glitches
• Factors that increase glitch magnitude
• Large coupling capacitance
• Fast slew (short transition time) on the aggressor
• Smaller grounded capacitance on the victim net
• Weaker drive strength on the victim net
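The first and third factors above can be seen in a charge-sharing estimate. The sketch below treats the coupling capacitance Cc and the victim grounded capacitance Cg as a capacitive divider, which assumes an instantaneous aggressor edge and an undriven victim. Aggressor slew and victim drive strength are therefore not modeled, and the result is only an upper bound. The function name and values are hypothetical.

```python
def glitch_peak_upper_bound(vdd, c_couple, c_ground):
    """Peak glitch voltage from an ideal capacitive divider: Vdd * Cc / (Cc + Cg)."""
    if c_couple < 0 or c_ground <= 0:
        raise ValueError("capacitances must be positive")
    return vdd * c_couple / (c_couple + c_ground)

# Hypothetical victim: 1.0 V supply, capacitances in fF.
print(glitch_peak_upper_bound(1.0, c_couple=2.0, c_ground=8.0))  # baseline
print(glitch_peak_upper_bound(1.0, c_couple=4.0, c_ground=8.0))  # larger coupling cap
print(glitch_peak_upper_bound(1.0, c_couple=2.0, c_ground=4.0))  # smaller grounded cap
```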
Causes of more Crosstalk in lower process nodes
• More metal layers
• Larger aspect ratio (i.e. tall & thin wires) gives more sidewall
capacitance, which can couple with other wires
• Density increasing (i.e. More logic packed in same area)
• Faster Frequency
• Lower supply Voltage
DC noise margin
• DC noise limits on the input of a cell while ensuring proper logic levels.
• VIH and VIL are steady state noise limits
VIH = Minimum voltage to interpret High (1) level logic
VIL = Maximum voltage to interpret Low (0) level logic
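The DC limits translate into noise margins once the driver's output levels are known. The sketch below uses the standard definitions NMH = VOH - VIH and NML = VIL - VOL, where VOH and VOL are the driver's steady-state output levels. All voltage values and function names are hypothetical.

```python
def dc_noise_margins(voh, vol, vih, vil):
    """Return the high and low DC noise margins in volts."""
    return {"high": voh - vih, "low": vil - vol}

def low_level_bump_is_safe(vol, bump_height, vil):
    """A steady bump on a logic-0 input is DC-safe if it stays at or below VIL."""
    return vol + bump_height <= vil

# Hypothetical 1.0 V cell: VOH=0.9, VOL=0.1, VIH=0.6, VIL=0.4.
margins = dc_noise_margins(voh=0.9, vol=0.1, vih=0.6, vil=0.4)
print(margins)
print(low_level_bump_is_safe(vol=0.1, bump_height=0.25, vil=0.4))
print(low_level_bump_is_safe(vol=0.1, bump_height=0.35, vil=0.4))
```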
AC noise limits
• Glitches that are very narrow (width) or very short (height) are safe.
• The AC noise rejection region can be specified through limits in the liberty
model.
Cross talk delay
• Aggressor and victim switch in opposite directions (i.e. + Xtalk)
• Increases the amount of charge required from the victim driver
• Victim delay increases
• Aggressor and victim switch in the same direction (i.e. - Xtalk)
• Charge on Cc remains the same before and after the transitions
• Victim delay is reduced
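A common first-order way to reason about both cases is a Miller factor on the coupling capacitance: Cc counts twice when the aggressor switches in the opposite direction, once when it is quiet, and not at all when it switches in the same direction. The sketch below combines that with a 0.69 * R * C single-pole delay estimate. It is a simplified model with hypothetical values, not what an STA tool computes.

```python
# Miller factor applied to the coupling capacitance (simplified model).
MILLER_FACTOR = {"opposite": 2.0, "quiet": 1.0, "same": 0.0}

def victim_delay(r_drive_ohm, c_ground_f, c_couple_f, aggressor):
    """50% delay of the victim net as 0.69 * R * (Cg + k * Cc), in seconds."""
    c_effective = c_ground_f + MILLER_FACTOR[aggressor] * c_couple_f
    return 0.69 * r_drive_ohm * c_effective

# Hypothetical net: 1 kOhm driver, 10 fF grounded cap, 5 fF coupling cap.
for case in ("same", "quiet", "opposite"):
    delay_ps = victim_delay(1000.0, 10e-15, 5e-15, case) * 1e12
    print(f"aggressor {case:8s}: {delay_ps:.2f} ps")
```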
Setup analysis with Crosstalk (aka delta) delay
• Consider the logic shown below, where crosstalk can occur on various nets
along the data path and along the clock paths.
• The setup (or max path) analysis assumes that
• Launch clock path sees positive Xtalk delay so that the data is launched late
• Data path sees positive Xtalk delay so that it takes longer for the data to reach the
destination
• Capture clock path sees negative Xtalk delay so that the data is captured by the capture flop
early
• Common clock path sees positive Xtalk delay for launch and negative Xtalk delay for capture.
• No credit is given for crosstalk delay on the common path in setup analysis.
Hold analysis with Crosstalk (aka delta) delay
• The worst case for 0-cycle hold (or min path) analysis in STA with
crosstalk assumes that
• Launch clock (not including the common path) sees negative crosstalk delay
so that the data is launched early
• Data path sees negative crosstalk delay so that it reaches the destination
early
• Capture clock (not including the common path) sees positive crosstalk delay
so that the data is captured by the capture flop late.
• For 0-cycle hold (i.e. launch and capture edges start at the same time), no crosstalk
delay applies to the common clock path.
• For non-zero-cycle hold, worst-case Xtalk delay is applied just like in setup analysis.
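The sign conventions for setup and 0-cycle hold can be checked with a small worked example. The sketch below splits a path into common clock, launch clock, data, and capture clock segments, each with a nominal delay and a crosstalk delta magnitude in integer picoseconds. All names and numbers are hypothetical, and only the crosstalk pessimism is modeled.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    nominal: int  # delay without crosstalk, ps
    delta: int    # magnitude of the crosstalk delta delay, ps

def setup_slack(period, t_setup, common, launch, data, capture):
    """Setup: launch side gets +delta, capture side gets -delta, common path included."""
    arrival = (common.nominal + common.delta
               + launch.nominal + launch.delta
               + data.nominal + data.delta)
    required = (period
                + common.nominal - common.delta
                + capture.nominal - capture.delta
                - t_setup)
    return required - arrival

def zero_cycle_hold_slack(t_hold, common, launch, data, capture):
    """0-cycle hold: launch/data get -delta, capture gets +delta, common path gets none."""
    arrival = (common.nominal
               + launch.nominal - launch.delta
               + data.nominal - data.delta)
    required = common.nominal + capture.nominal + capture.delta + t_hold
    return arrival - required

common, launch = Segment(200, 10), Segment(100, 5)
data, capture = Segment(600, 20), Segment(110, 5)
print("setup slack:", setup_slack(1000, 50, common, launch, data, capture), "ps")
print("hold slack :", zero_cycle_hold_slack(30, common, launch, data, capture), "ps")
```

In this example the common-path delta of 10 ps costs 20 ps of setup slack, because it is added on the launch side and subtracted on the capture side.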
Glitch & Delay computation due to crosstalk
• The SI noise effects of multiple aggressors accumulate.
• The STA tool filters aggressors based on the functionality of the nets (i.e. mode, exceptions).
• Switching (aka timing) windows of the aggressor nets are used.
• The default aggressor switching window could be infinite, depending on the STA tool.
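Switching-window filtering can be sketched as an interval-overlap test. In the block below an aggressor contributes to the victim only if its switching window overlaps the victim's window, an infinite window (represented as None) always overlaps, and the bumps of the remaining aggressors are simply summed. The summation and all names and numbers are simplifying assumptions, not a tool's actual algorithm.

```python
def windows_overlap(a, b):
    """True if closed intervals a and b (start, end) overlap; None means infinite."""
    if a is None or b is None:
        return True
    return a[0] <= b[1] and b[0] <= a[1]

def accumulated_bump(victim_window, aggressors):
    """Return (active aggressor names, summed bump height).

    aggressors maps name -> (switching window or None, bump height in mV).
    """
    active = [name for name, (window, _) in aggressors.items()
              if windows_overlap(victim_window, window)]
    total = sum(aggressors[name][1] for name in active)
    return active, total

# Hypothetical windows in ps and bump heights in mV.
aggressors = {
    "agg_a": ((150, 250), 60),   # overlaps the victim window
    "agg_b": ((300, 400), 80),   # switches after the victim window: filtered out
    "agg_c": (None, 40),         # infinite window: always counted
}
print(accumulated_bump((100, 200), aggressors))
```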
Clock group for Crosstalk (aka delta) delay
• set_clock_groups -asynchronous
Clocks with no timing relationship between them, so no timing paths are
checked, but crosstalk can occur among their nets with an infinite window.
• set_clock_groups -logical_exclusive
Clocks with no logical timing path between them, but crosstalk impact is
checked among their nets using overlapping windows.
• set_clock_groups -physical_exclusive
Clocks with no logical timing path between them and no crosstalk impact among
the physically isolated nets.
Fix the SI violation
• Increase spacing between victim & aggressor nets.
• Shield nets with VDD or GND.
• Upsize the victim receiver.
• Downsize the aggressor driver.
• Apply Non Default Rules (i.e. NDR) with double spacing & shielding rules
for clock nets.
SI and Clock Tree: Shielded Routing
• Clock constrained to metals 3 & 4
• 1 minimum-width wire used to shield on each side of the clock wires, only on metals 3 & 4
• Metals 1 & 2 used for pin access with no shielding
[Figure: clock wire, shield wires, and signal wire]
Shielded Routing
• With inserted shield (Gnd/Vdd): coupling capacitance only between shield and clock wire
• Very predictable amount of coupling
• No coupling capacitance between wires outside of the shields
[Figure: clock wire without shield vs. with inserted shield wires]
PrimeTime SI: Cross talk delay
[Flow diagram, input: Parasitics Data]
• Filtering (Electrical Filtering): Increase Speed, Increase Capacity, Reduce Complexity
• Delay Calculation (Arrival Analysis, Logical Correlation, Delta Delay): Improve Accuracy
• Generate Timing Reports (Victim/Aggressor Net Reports): Identify Crosstalk Problems
Timing reports with delta delays
(The Delta column, enabled with -crosstalk_delta, shows the delay introduced by crosstalk.)
Path Group: clock
Path Type: max

Point                                      Fanout   Delta    Incr      Path
--------------------------------------------------------------------------
clock clock (rise edge)                                      0.00      0.00
clock network delay (ideal)                                  0.00      0.00
hostif_0/host_address_regx31x/CP (FD2S)                      0.00      0.00 r
hostif_0/host_address_regx31x/Q (FD2S)                       0.47 &    0.47 f
hostif_0/U1569/D (NR4X05)                           0.03     0.03 &    0.81 f
hostif_0/U1569/Z (NR4X05)                                    0.51 &    1.32 r
hostif_0/n5966 (net)                         1
hostif_0/U1440/F (ND8P)                             0.25     0.25 &    1.57 r
:
hostif_0/U1994/Z (ND3A)                                      0.24 &    6.20 r
hostif_0/n6400 (net)                         1
hostif_0/taplink_tx_data_regx4x/D (FD2S)            0.02     0.02 &    6.22 r
clock clock (rise edge)                                      5.00      5.00
clock network delay (ideal)                                  0.00      5.00
hostif_0/taplink_tx_data_regx4x/CP (FD2S)                              5.00 r
library setup time                                          -0.39      4.61
--------------------------------------------------------------------------
data required time                                                     4.61
data arrival time                                                     -6.22
--------------------------------------------------------------------------
slack (VIOLATED)                                                      -1.61
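The bottom of the report can be re-derived by hand: required time is the capture edge plus clock network delay minus library setup time, and slack is required time minus arrival time. The sketch below repeats that arithmetic with the numbers from the report, using Decimal to avoid floating-point rounding. The variable names are illustrative only.

```python
from decimal import Decimal

# Values copied from the report above (ns).
capture_edge = Decimal("5.00")
clock_network_delay = Decimal("0.00")
library_setup = Decimal("0.39")
data_arrival = Decimal("6.22")

# Delta delays visible in the excerpt; the ":" line elides part of the path,
# so this is not necessarily the total crosstalk contribution.
visible_deltas = [Decimal("0.03"), Decimal("0.25"), Decimal("0.02")]

data_required = capture_edge + clock_network_delay - library_setup
slack = data_required - data_arrival
visible_delta_total = sum(visible_deltas)

print("data required time:", data_required)
print("slack             :", slack)
print("visible delta sum :", visible_delta_total)
print("slack without visible deltas:", slack + visible_delta_total)
```

Removing only the delta delays visible in the excerpt would still leave the path violating, so crosstalk fixes alone may not close this path.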
Noise report
report_noise -above -verbose -nosplit
(Callouts: victim net & cell input pin; multiple aggressor nets; bump width, height, & noise slack.)
****************************************
Report : noise
-verbose
-above
Design : cpu_core
Version: U-2003.03
Date : Thu Feb 27 09:35:41 2003
****************************************
slack type: area
noise_region: above_low
pin name (net name)                                  width   height   slack
---------------------------------------------------------------------------
rx_snt10g_frm78_prot_1/prot_rx_3/data_out_reg[27]/D (N43563)
Aggressors:
rst_prot_rx78_l_3__INBUF_103                          0.80     0.00
rx_snt10g_frm78_prot_1/prot_rx_3/poly_seq[4]          1.09     0.00
rx_snt10g_frm78_prot_1/n30613_6                       0.47     0.00
Propagated:
rx_snt10g_frm78_prot_1/U32709_C1/Z                    0.38     0.87
Total:                                                0.38     0.87    -0.12
Signals on which glitch must be avoided
• Input of sequential cells or Memory
• Reset and Set signal
• Output ports
Lab #6
• Copy directory “/home/jdshah/spring_2024_labs/lab6”
and follow the instructions in the lab_excercise.txt file
Thank You
Backup Slides
Corner Explosion
Operating modes: functional, scan shift, scan capture, BIST, async
FE corners: SS, TT, FF
[Figure axis: SSG, SSGNP, TT, FFGNP, FF]
BE corners: C-worst, C-best, Typical, RC-worst, RC-best
Corner      ΔW        ΔT        ΔH
Typical     typical   typical   typical
C-best      min       min       max
C-worst     max       max       min
RC-best     max       max       max
RC-worst    min       min       min
Temp corners: cold, hot
Voltage: Vlow, Vnominal, Vhigh
Worst case Corner
• Design for worst case
• Usually for setup, the worst delay corner is SS, low voltage, high and low
temp, with Cworst & RCworst wire corners (depends on design and process).
• Usually for hold, the minimum delay corner is FF, high voltage, high and low
temp, or SS, low voltage, high or low temp for clock-skew-dominated paths.
• Hold analysis needs to be run for all wire corners (i.e. more pessimism required).
• Robust, as it covers the process yield distribution
• Increases cost (larger die) and schedule (more difficult to fix setup/hold
violations across SS to FF)
[Figure: design & process window bounded by FF,Vhigh,hot; FF,Vhigh,cold; SS,Vlow,hot; SS,Vlow,cold]
Clock Jitter margin
• The clock output waveform or edge from the source (i.e. PLL) is not exactly
what is specified in STA, due to random noise in the circuit and power
supply noise in the clock distribution.
• The deviation of a clock edge or period from the ideal is defined as jitter.
• Specify jitter margin through clock uncertainty for both same-edge
and opposite-edge timing paths.
Noise Immunity Failure Definition
[Figure: DC transfer curve of an inverter (Vil, Vih, Vol, Voh) and output voltage bumps at the failure threshold]
Noise Immunity Failure Definition (2)
[Figure: glitch height between Vss and Vdd, with Vih_min and Vih_max levels, showing safe glitches, potentially hazardous glitches, and propagation to output]