
CDC Verification

Chapter 8 discusses the challenges of Clock Domain Crossing (CDC) verification in multi-clock domain designs, highlighting issues like metastability and the need for effective synchronizing techniques. It outlines various methods for synchronization, including two-flop and three-flop synchronizers, and emphasizes the importance of SystemVerilog Assertions in ensuring signal integrity during CDC. The chapter also touches on multi-bit synchronization strategies, such as using Gray Code and asynchronous FIFOs to manage data transfer across clock domains.

Uploaded by

minhthanhstudy

Chapter 8

Clock Domain Crossing (CDC) Verification

Chapter Introduction
Clock domain crossing (CDC) has become an ever-increasing problem in multi-clock
domain designs. One must address issues not only at the RTL level but also in the
physical timing. This chapter starts with an explanation of metastability and then
dives into different synchronization techniques. It also discusses the role of
SystemVerilog Assertions in the verification of CDC. We will then discuss a complete
methodology.

8.1 Design Complexity and CDC

There are hardly any designs today that operate on a single clock. A typical SoC will
have three or more clocks, and these will be asynchronous to one another. We have
all done CDC checks using lint tools, among others. But the problem is that there is
a disconnect between RTL static or simulation-based analysis and what we see in the
physical chip. Metastability due to clock domain crossing is not very predictable
at RTL or gate level. Therefore, simulation does not accurately predict silicon
behavior, and critical bugs may escape the verification process. As a result,
almost 25% of all respins are due to clocking issues, CDC being chief among them.
Here’s an example of typical real-life designs and the number of clock domains and
CDC signals they have. This is just a representative data point (Ping Yeung, Ph.D.).

Design type                  Number of clock domains   Number of CDC signals
Gigabit Ethernet interface   24–28                     ~11,000
Graphics application         36–42                     ~18,000
Multimedia SoC               4                         ~54
Wireless                     8                         ~365

This table goes to show the complexity of CDC verification. Both single-bit and
multi-bit synchronizations need to take place.

© Springer International Publishing AG 2018


A.B. Mehta, ASIC/SoC Functional Design Verification,
DOI 10.1007/978-3-319-59418-7_8

Fig. 8.1 Clock domain crossing—metastability

8.2 Metastability

The main culprit in CDC is metastability, which can occur when data crosses
from one clock domain to another. The transmit clock can be slower or faster than
the receive clock. Data that crosses the boundary can end up violating the
setup/hold requirements of the receive clock domain. This is explained via Fig. 8.1,
which shows a synchronization failure that occurs when TxData, generated in the
TxClk clock domain, changes too close (setup violation) to the rising edge of
RxClk in the Rx logic domain. Synchronization failure is caused by an output going
metastable and not converging to a legal stable state.
When TxData violates the setup time of RxClk, RxData goes metastable, meaning
we don’t know what state it will settle to, or whether it will settle at all, within
one clock. If TxData is held “long” enough, RxData will eventually become stable and
end up in the correct state. For the sake of simplicity, I’ve shown the metastable
RxData stabilizing within one clock. But that may not necessarily be the case in all
instances. If the metastable RxData is fed directly into the forward logic, you do not
know what metastable state got propagated to the forward logic. Since the CDC
signal can fluctuate for some period of time, different inputs in the receiving clock
domain might see the fluctuating signal as different logic levels and hence
propagate erroneous signals into the receiving clock domain. In simulation, a
flip-flop model that flags the timing violation drives its output to “X” (unknown)
to represent this metastable state (correctly so), and the logic beyond RxDFF may be
rendered useless (i.e., “X” propagation will cause all sorts of issues in the logic).
In short, synchronization failure is caused by an output going metastable and not
converging to a legal stable state by the time the output must be sampled again.

Fig. 8.2 Clock domain crossing: two-flop single-bit synchronizer

8.3 Synchronizer

8.3.1 Two-Flop Synchronizer (Identical Transmit and Receive Clock Frequencies)

A synchronizer is a device that samples an asynchronous signal and outputs a version
of the signal that has transitions synchronized to a sample clock.
The simplest synchronizer used in designs is a two-flop synchronizer (Fig. 8.2).
The idea is that the flop on the transmit side samples the data input on its own
clock (let’s call this the transmit clock). The sampling edge of the first flop on
the receive side can fall very close to a data transition launched by the transmit
clock. In this case, the receive flop captures the transmit flop’s output while it
is changing and produces a metastable signal at its output, because the data output
of the transmit flop violated the setup/hold requirement of the receive flop. If you
let the receive flop output propagate to the design, the results will be
unpredictable, because this output can be a “1” or a “0”; you don’t know.
But if you insert a second flop in the receiving circuit, the metastable output
of the first receive flop will have time (one clock’s worth) to stabilize
before being latched into the second flop on the receive side. Now, the output of this
second flop will have a stable value and can propagate to the rest of the design
without unpredictability. Please refer to Fig. 8.3 to understand this scenario. To
reiterate, the first flip-flop samples the asynchronous input signal into the new
clock domain and waits for a full clock cycle to permit any metastability on the
stage-1 output signal to decay. The stage-1 signal is then sampled by the same clock
into a second-stage flip-flop, with the intended goal that the stage-2 signal is now
a stable and valid signal, synchronized and ready for distribution within the new
clock domain.

Fig. 8.3 Clock domain crossing: synchronizer waveform
A couple of implementation guidelines for the two-flop synchronizer:
1. There should not be any combinational logic between the Transmit DFF and the
Receive DFF. This allows for maximum metastability resolution time.
2. The RxDFF1 and RxDFF2 synchronizer flops should be placed as close together as
possible during layout. Most companies nowadays offer predefined, laid-out, and
verified synchronizer macros that can be instantiated by hand in RTL.
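
The two-flop structure described in this section can be sketched in a few lines of RTL. This is a minimal illustration, not a vendor macro: the module name sync_2ff, the port names, and the active-low asynchronous reset are assumptions made for the example, while the comments map each signal to the names used in the figures (TxData, Rx1Data, RxData).

```systemverilog
// Two-flop single-bit synchronizer (sketch).
// sync_2ff and its port names are illustrative assumptions.
module sync_2ff (
  input  wire rx_clk,    // receive-domain clock (RxClk)
  input  wire rx_rst_n,  // active-low asynchronous reset of the receive domain
  input  wire tx_data,   // output of the transmit flop (TxData), no logic in between
  output reg  rx_data    // synchronized output (RxData)
);
  reg rx1_data;          // first-stage output (Rx1Data); this node may go metastable

  always @(posedge rx_clk or negedge rx_rst_n) begin
    if (!rx_rst_n) begin
      rx1_data <= 1'b0;
      rx_data  <= 1'b0;
    end else begin
      rx1_data <= tx_data;   // stage 1: samples the asynchronous input
      rx_data  <= rx1_data;  // stage 2: samples stage 1 one RxClk later
    end
  end
endmodule
```

Keeping both stages in one small module makes it easier to honor the two guidelines above: nothing can be inserted between the stages, and the pair can be placed together during layout.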

8.3.2 Three-Flop Synchronizer (High-Speed Designs)

For some very high-speed designs, the mean time between failures (MTBF) of a
two-flop synchronizer is too short: at high clock and data rates, one receive clock
period may not leave enough time for the metastable output of RxDFF1 to settle
before the second flop samples it. In such cases, you may need a three-flop
synchronizer to compensate for the high speed. Metastability may not have settled
at RxDFF2 (Rx2Data), hence the need for the third flop (RxDFF3) (Fig. 8.4).
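
A commonly used first-order model makes the benefit of the extra flop concrete. The symbols are assumptions introduced here for illustration and are not taken from the chapter: $t_r$ is the resolution time available before the next flop samples, $\tau$ and $T_W$ are device-dependent constants of the flip-flop, $f_{clk}$ is the receive clock frequency, and $f_{data}$ is the rate at which the transmitted data toggles.

```latex
\mathrm{MTBF} = \frac{e^{\,t_r/\tau}}{T_W \, f_{clk} \, f_{data}}
```

Under this model, each additional synchronizer stage adds roughly one receive clock period to $t_r$, so the MTBF grows exponentially with the number of stages, while raising either frequency shortens it.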
Fig. 8.4 Three-flop single-bit synchronizer (TxDFF, clocked by TxClk, drives TxData into RxDFF1, RxDFF2, and RxDFF3, which are clocked by RxClk and produce Rx1Data, Rx2Data, and RxData for the Rx logic)

Fig. 8.5 Two-flop single-bit synchronizer

8.3.3 Synchronizing Fast-Clock (Transmit) into Slow-Clock (Receive) Domains

So far, we have seen synchronizers that work when both the transmit and the receive
clocks are of the same frequency. Note that if the transmit clock is slower than the
receive clock, the two-flop (or three-flop) synchronizer will work quite well.
Recognizing that sampling slower signals into faster-clock domains causes fewer
potential problems than sampling faster signals into slower-clock domains, a
designer might want to take advantage of this fact by using simple two flip-flop
synchronizers to pass single CDC signals between clock domains.
But when the transmit clock is faster than the receive clock, there is the
possibility that a signal from the transmit logic may change value twice before it
can be sampled, or may change too close to the sampling edge of the slower receive
clock domain.
For the ensuing discussion, let us call the signal that needs synchronization the
CDC signal. That will make it easier to describe the concept. Here’s the two-flop
synchronizer (Fig. 8.5) for ease of reference.
Fig. 8.6 Faster transmit clock to slower receive clock: two-flop synchronizer won't work (waveform of TxClk, TxData, RxClk, Rx1Data, and RxData)

Fig. 8.7 Lengthened transmit pulse for correct capture in receive clock domain (waveform of TxClk, TxData, RxClk, Rx1Data, and RxData)

If the CDC signal is pulsed for only one fast-clock cycle, it could go high and
low between two rising edges of the slower clock and not be captured into
the slower-clock domain. This is shown in Fig. 8.6. In this figure, TxData goes high
and then goes low (one high pulse) within a single RxClk period. In other words, this
high pulse will not be captured by RxClk. As a result, Rx1Data remains at the
previously captured state of “0”, and so does RxData. The high pulse on
TxData is dropped by the receive logic, which will result in incorrect behavior in
the receive logic.
Hence, a two-flop synchronizer on its own won’t work when the transmit clock is
faster than the receive clock.
One potential solution to this problem is to assert the TxData signal (i.e., the
CDC signal) for a period that exceeds the cycle time of the receive clock. This is
shown in Fig. 8.7. The general rule of thumb is that the minimum pulse width of the
transmit signal be 1.5x the period of the receive clock. The assumption is
that the CDC signal will then be sampled at least once by the receive clock. The
issue with this solution arises if an engineer mistakes it for a general-purpose
solution and misses the pulse-width requirement on the transmit (CDC) signal. This is
where SystemVerilog Assertions come into the picture. Put an assertion on the CDC
signal to check its pulse width when it crosses from the high-frequency to the
low-frequency domain.
There are other solutions to tackle this problem, which are beyond the scope of
this book.
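
The pulse-width check suggested above might be sketched as follows. This is a minimal sketch under stated assumptions, not an assertion from the book: the checker name cdc_pulse_width_chk, its ports, and the parameter MIN_TX_CYCLES are hypothetical. TxData is assumed to be registered on TxClk, and MIN_TX_CYCLES is assumed to be chosen by the user as the number of TxClk cycles that covers at least 1.5 RxClk periods.

```systemverilog
// Hypothetical checker: every level of the CDC signal must be held for at
// least MIN_TX_CYCLES transmit-clock cycles.
// Assumption: the user sets MIN_TX_CYCLES >= ceil(1.5 * T_RxClk / T_TxClk).
module cdc_pulse_width_chk #(
  parameter int MIN_TX_CYCLES = 3
) (
  input logic tx_clk,    // transmit (fast) clock, TxClk
  input logic tx_rst_n,  // active-low reset of the transmit domain
  input logic tx_data    // single-bit CDC signal, TxData
);
  // Once TxData rises, it must be sampled high on MIN_TX_CYCLES
  // consecutive TxClk edges, starting with the edge that saw the rise.
  property p_high_width;
    @(posedge tx_clk) disable iff (!tx_rst_n)
      $rose(tx_data) |-> tx_data [*MIN_TX_CYCLES];
  endproperty

  // Same requirement for the low level after a falling transition.
  property p_low_width;
    @(posedge tx_clk) disable iff (!tx_rst_n)
      $fell(tx_data) |-> (!tx_data) [*MIN_TX_CYCLES];
  endproperty

  a_high_width : assert property (p_high_width)
    else $error("TxData high for fewer than %0d TxClk cycles", MIN_TX_CYCLES);
  a_low_width  : assert property (p_low_width)
    else $error("TxData low for fewer than %0d TxClk cycles", MIN_TX_CYCLES);
endmodule
```

For example, if the RxClk period is twice the TxClk period, 1.5 receive periods correspond to three TxClk cycles, which is the default used in the sketch.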

8.3.4 Multi-bit Synchronization

When passing multiple signals between clock domains, simple synchronizers do not
guarantee safe delivery of the data. A frequent mistake made by engineers when
working on multi-clock designs is passing multiple CDC bits required in the same
transaction from one clock domain to another and overlooking the importance of the
synchronized sampling of the CDC bits.
The problem is that multiple signals that are synchronized to one clock will
experience small data-changing skews that can occasionally be sampled on different
rising clock edges in a second clock domain. Even if we could perfectly control and
match the trace lengths of the multiple signals, differences in rise and fall times as
well as process variations across a die could introduce enough skew to cause
sampling failures on otherwise carefully matched traces.
Here are a couple of solutions to the multi-bit synchronization problem.
An in-depth discussion of these solutions is out of the scope of this book, but I
highly recommend the SNUG paper by Cliff Cummings listed in the Bibliography
(Clifford E. Cummings).
1. The Gray Code Solution Where Multiple CDC Bits Are Passed Using Gray Codes
The safest counters that can be used in multi-clock designs are Gray Code counters.
Gray Codes allow only one bit to change for each clock transition, eliminating
the problem associated with trying to synchronize multiple changing CDC bits
across a clock domain. Standard Gray Codes have very nice translation properties
to convert gray to binary and back again. Using these conversions, it is simple to
design efficient Gray Code counters.
I am sure we are familiar with the Binary to Gray and Gray to Binary code
conversion formulas, but they are presented here for the sake of completeness.
4-bit Gray to Binary conversion:

binary[0] = gray[3] ^ gray[2] ^ gray[1] ^ gray[0];
binary[1] = gray[3] ^ gray[2] ^ gray[1];
binary[2] = gray[3] ^ gray[2];
binary[3] = gray[3];

This can also be represented as:

binary[0] = gray[3] ^ gray[2] ^ gray[1] ^ gray[0]; // gray>>0
binary[1] = 1'b0    ^ gray[3] ^ gray[2] ^ gray[1]; // gray>>1
binary[2] = 1'b0    ^ 1'b0    ^ gray[3] ^ gray[2]; // gray>>2
binary[3] = 1'b0    ^ 1'b0    ^ 1'b0    ^ gray[3]; // gray>>3

And here’s the Binary to Gray conversion:

gray[0] = binary[0] ^ binary[1];
gray[1] = binary[1] ^ binary[2];
gray[2] = binary[2] ^ binary[3];
gray[3] = binary[3] ^ 1'b0;
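
The same conversions can be written for any width. The following is a minimal sketch of that generalization; the module name gray_conv, its port names, and the parameter WIDTH are illustrative assumptions rather than part of the original text.

```systemverilog
// Width-generic Gray/binary converters (sketch).
// gray_conv, WIDTH, and the port names are illustrative assumptions.
module gray_conv #(
  parameter WIDTH = 4
) (
  input  wire [WIDTH-1:0] bin_in,    // binary value to encode
  input  wire [WIDTH-1:0] gray_in,   // Gray value to decode
  output wire [WIDTH-1:0] gray_out,  // Gray encoding of bin_in
  output reg  [WIDTH-1:0] bin_out    // binary decoding of gray_in
);
  integer i;

  // Binary to Gray: gray[i] = binary[i] ^ binary[i+1], with a 0 above the MSB.
  assign gray_out = (bin_in >> 1) ^ bin_in;

  // Gray to binary: binary[i] is the XOR reduction of (gray >> i).
  always @* begin
    for (i = 0; i < WIDTH; i = i + 1)
      bin_out[i] = ^(gray_in >> i);
  end
endmodule
```

With WIDTH = 4, both expressions reduce to the bit-by-bit equations listed above.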

2. Asynchronous FIFO Implementation


Passing multiple bits, whether data bits or control bits, can be done through an
asynchronous FIFO. An asynchronous FIFO is a shared memory or register buffer
where data is inserted from the write clock domain and data is removed from the
read clock domain. Since both sender and receiver operate within their own respec-
tive clock domains, using a dual-port buffer, such as a FIFO, is a safe way to pass
multi-bit values between clock domains. A standard asynchronous FIFO device
allows multiple data or control words to be inserted if the FIFO is not full. The
receiver can then extract multiple data or control words if the FIFO is not empty.

8.3.5 Design of an Asynchronous FIFO Using Gray Code Counters

Gray Code counters are used in this asynchronous FIFO design for the Read_pointer
and the Write_pointer, guaranteeing successful transfer of multi-bit data
from the write clock (aka the transmit clock) to the read clock (aka the receive
clock). Let us look at an asynchronous FIFO design that uses Gray Code counters.

module asynchronous_fifo (
  // Outputs
  fifo_out, full, empty,
  // Inputs
  wclk, wclk_reset_n, write_en,
  rclk, rclk_reset_n, read_en,
  fifo_in
);

`define FF_DLY 1'b1        // flop clock-to-Q delay used in simulation

parameter D_WIDTH = 20;    // data width
parameter D_DEPTH = 4;     // number of FIFO entries
parameter A_WIDTH = 2;     // address width (D_DEPTH = 2**A_WIDTH)

input wclk_reset_n;
input rclk_reset_n;
input wclk;
input rclk;
input write_en;
input read_en;
input [D_WIDTH-1:0] fifo_in;

output [D_WIDTH-1:0] fifo_out;
output full;
output empty;

reg [D_WIDTH-1:0] reg_mem[0:D_DEPTH-1];

// Pointers are one bit wider than the address to tell full from empty
reg [A_WIDTH:0] wr_ptr;
reg [A_WIDTH:0] wr_ptr_gray;
reg [A_WIDTH:0] wr_ptr_gray_rclk_q;    // write pointer, 1st sync flop (rclk)
reg [A_WIDTH:0] wr_ptr_gray_rclk_q2;   // write pointer, 2nd sync flop (rclk)
reg [A_WIDTH:0] rd_ptr;
reg [A_WIDTH:0] rd_ptr_gray;
reg [A_WIDTH:0] rd_ptr_gray_wclk_q;    // read pointer, 1st sync flop (wclk)
reg [A_WIDTH:0] rd_ptr_gray_wclk_q2;   // read pointer, 2nd sync flop (wclk)

reg full;
reg empty;

wire [A_WIDTH:0] nxt_wr_ptr;
wire [A_WIDTH:0] nxt_rd_ptr;
wire [A_WIDTH:0] nxt_wr_ptr_gray;
wire [A_WIDTH:0] nxt_rd_ptr_gray;
wire [A_WIDTH-1:0] wr_addr;
wire [A_WIDTH-1:0] rd_addr;
wire full_d;
wire empty_d;

assign wr_addr = wr_ptr[A_WIDTH-1:0];
assign rd_addr = rd_ptr[A_WIDTH-1:0];

// Memory write (wclk domain) and asynchronous read
always @ (posedge wclk)
  if (write_en) reg_mem[wr_addr] <= #`FF_DLY fifo_in;

assign fifo_out = reg_mem[rd_addr];

// Write pointer: binary and Gray versions.
// Note: the pointer advances on write_en even when full is set.
always @ (posedge wclk or negedge wclk_reset_n)
  if (!wclk_reset_n) begin
    wr_ptr      <= #`FF_DLY {(A_WIDTH+1){1'b0}};
    wr_ptr_gray <= #`FF_DLY {(A_WIDTH+1){1'b0}};
  end else begin
    wr_ptr      <= #`FF_DLY nxt_wr_ptr;
    wr_ptr_gray <= #`FF_DLY nxt_wr_ptr_gray;
  end

assign nxt_wr_ptr = (write_en) ? wr_ptr+1 : wr_ptr;
assign nxt_wr_ptr_gray = ((nxt_wr_ptr>>1) ^ nxt_wr_ptr);

// Read pointer: binary and Gray versions.
// Note: the pointer advances on read_en even when empty is set.
always @ (posedge rclk or negedge rclk_reset_n)
  if (!rclk_reset_n) begin
    rd_ptr      <= #`FF_DLY {(A_WIDTH+1){1'b0}};
    rd_ptr_gray <= #`FF_DLY {(A_WIDTH+1){1'b0}};
  end else begin
    rd_ptr      <= #`FF_DLY nxt_rd_ptr;
    rd_ptr_gray <= #`FF_DLY nxt_rd_ptr_gray;
  end

assign nxt_rd_ptr = (read_en) ? rd_ptr+1 : rd_ptr;
assign nxt_rd_ptr_gray = (nxt_rd_ptr>>1) ^ nxt_rd_ptr;

// check full: two-flop synchronize the Gray read pointer into wclk
always @ (posedge wclk or negedge wclk_reset_n)
  if (!wclk_reset_n)
    {rd_ptr_gray_wclk_q2, rd_ptr_gray_wclk_q} <= #`FF_DLY
      {{(A_WIDTH+1){1'b0}}, {(A_WIDTH+1){1'b0}}};
  else
    {rd_ptr_gray_wclk_q2, rd_ptr_gray_wclk_q} <= #`FF_DLY
      {rd_ptr_gray_wclk_q, rd_ptr_gray};

// Full: next Gray write pointer equals the synchronized Gray read pointer
// with its two MSBs inverted
assign full_d = (nxt_wr_ptr_gray ==
                 {~rd_ptr_gray_wclk_q2[A_WIDTH:A_WIDTH-1],
                   rd_ptr_gray_wclk_q2[A_WIDTH-2:0]});

always @ (posedge wclk or negedge wclk_reset_n)
  if (!wclk_reset_n)
    full <= #`FF_DLY 1'b0;
  else
    full <= #`FF_DLY full_d;

// check empty: two-flop synchronize the Gray write pointer into rclk
always @ (posedge rclk or negedge rclk_reset_n)
  if (!rclk_reset_n)
    {wr_ptr_gray_rclk_q2, wr_ptr_gray_rclk_q} <= #`FF_DLY
      {{(A_WIDTH+1){1'b0}}, {(A_WIDTH+1){1'b0}}};
  else
    {wr_ptr_gray_rclk_q2, wr_ptr_gray_rclk_q} <= #`FF_DLY
      {wr_ptr_gray_rclk_q, wr_ptr_gray};

// Empty: next Gray read pointer equals the synchronized Gray write pointer
assign empty_d = (nxt_rd_ptr_gray == wr_ptr_gray_rclk_q2);

always @ (posedge rclk or negedge rclk_reset_n)
  if (!rclk_reset_n)
    empty <= #`FF_DLY 1'b1;
  else
    empty <= #`FF_DLY empty_d;

endmodule

In the next section, we will see how to use SystemVerilog Assertions to make
sure that data are not dropped when write data (on write clock) are transferred
through Gray Code counter synchronization logic to read data (on read clock).

8.4 CDC Checks Using SystemVerilog Assertions

As we saw in Chap. 6, SystemVerilog Assertions (SVA) are a great way to check
sequential domain conditions at clock (or sampling edge) boundaries. The CDC
signals crossing from one clock domain to another are perfect candidates for
checking with SVA. SVA fully supports multi-clock domain assertions as well as
multi-threaded local variables, which let you write foolproof checkers to see that
your CDC synchronizers (whatever the design style) work as promised. Note that the
assertions presented here can be used both for simulation-based checking and
formal-based checking (static functional). But I will focus on simulation-based
checking since formal/static functional verification is still not fully adopted by
many engineering groups and requires a complete chapter in itself.
Let us start with the simplest of designs. Later we will see a comprehensive
assertion for CDC multi-bit data transfer using the Gray Code counter-based
asynchronous FIFO described above.
Recall the two-flop synchronizer of Sect. 8.3.1 (see Fig. 8.2).

For any synchronizer design, there will be assumptions on TxData stability.
Should it remain stable for two clocks? Three clocks? This is to make sure that the
CDC signal Rx1Data has enough time to filter the metastability region and pass the
correct value to RxData (the output). Let us go with the assumption that TxData
should remain stable for two clocks every time it assumes a new value (i.e., it
changes). This assumption is required since we assume that TxClk is faster than
RxClk. Refer to Fig. 8.7 for the timing diagram of this design.
Here’s a simple assertion to check for TxData stability:

property TxData_stable;
  @(posedge TxClk) $changed(TxData) |=> $stable(TxData) [*2];
endproperty

assert property (TxData_stable);

Let us now see how to make sure that this two-flop single-bit synchronizer
correctly transfers data so that RxData === TxData after the metastability filter:

property Tx_to_Rx_CDC_DataCheck;
  logic Data;  // local variable (TxData is a single bit), one copy per thread

  @(posedge TxClk) ($changed(TxData)) |=>
    (1'b1, Data = TxData) ##1
    @(posedge RxClk) (Rx1Data === Data) ##1 (RxData === Data);
endproperty: Tx_to_Rx_CDC_DataCheck

assert property (Tx_to_Rx_CDC_DataCheck);

First, the assertion checks that TxData has changed at the posedge of TxClk. If it
has, we store TxData into the multi-threaded local variable Data. The 1'b1 is
required because a local variable assignment must be attached to an expression.
Since we don’t have any condition, we simply use “always true” as the expression.
“Always true” means always store TxData into Data whenever TxData changes. Then, we
check at the CDC boundary clock RxClk that the data has indeed transferred to
Rx1Data by comparing Rx1Data with the stored TxData (in Data). One RxClk later,
RxData must match the TxData that was transmitted on TxClk. This guarantees that
the CDC 1-bit two-flop synchronization works as intended. Again, note that the
assumption of TxClk faster than RxClk must be adhered to.
As an exercise, see if you can write a simple assertion to check for a glitch on
TxData. The above solution assumes no glitch on TxData.
Ok, now let us write a comprehensive assertion for a multi-bit Gray Code
counter-based data transfer across the CDC region. This assertion is written for the
asynchronous FIFO design shown in Sect. 8.3.5. The write data are written through
fifo_in on wclk (write clock) and read from fifo_out on rclk (read clock). The
assertion has to make sure that whatever data were written into the FIFO at the
write pointer, the same data are read out from the FIFO when the read pointer is
equal to that write pointer:

// aff1 is assumed to be the instance name of asynchronous_fifo
sequence rd_detect(ptr);
  ##[0:$] (read_en && !empty && (aff1.rd_ptr == ptr));
endsequence

property data_check(wrptr);
  integer ptr, data;
  @ (posedge wclk) disable iff (!wclk_reset_n || !rclk_reset_n)
    (write_en && !full, ptr=wrptr, data=fifo_in,
     $display($stime,"\t Assertion Disp wr_ptr=%h data=%h", aff1.wr_ptr, fifo_in))
    |=>
    @ (negedge rclk) first_match(rd_detect(ptr),
      $display($stime,,," Assertion Disp FIRST_MATCH ptr=%h Compare data=%h fifo_out=%h",
               ptr, data, fifo_out))
    ##0 (fifo_out === data);
endproperty

dcheck : assert property (data_check(aff1.wr_ptr))
  else $display($stime,,,"FAIL: DATA CHECK");
dcheckc : cover property (data_check(aff1.wr_ptr))
  $display($stime,,,"PASS: DATA CHECK");

In this assertion, the data_check property first checks for a write while the FIFO
is not full. If so, it saves wr_ptr into the local variable “ptr” and the write data
(fifo_in) into the local variable “data” and displays them so that we can easily see
how the assertion is progressing during simulation.
If the antecedent is true, the consequent looks for the first match of rd_ptr being
the same as wr_ptr (note wr_ptr was stored in local variable ptr) and checks that the
read data is the same as the write data (note write data were stored in local
variable data in the antecedent).
Sequence rd_detect(ptr) is used as the argument to first_match. It says: wait
from now until forever until you detect a read whose rd_ptr is equal to the wr_ptr
(which is stored in the local variable “ptr” in the antecedent).
Many such assertions can be written to see that your synchronizer design works.
As an exercise, try writing simple assertions for your synchronizer design.
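
As a starting point for such assertions, here are two simple interface checks for the asynchronous FIFO of Sect. 8.3.5. That FIFO advances its pointers whenever write_en or read_en is asserted, regardless of full and empty, so it relies on the surrounding logic to respect the flags. The checker name fifo_if_chk is a hypothetical name chosen for this sketch; the port names follow the FIFO's own signals.

```systemverilog
// Hypothetical interface checker for the asynchronous FIFO of Sect. 8.3.5.
// fifo_if_chk is an illustrative name; ports mirror the FIFO's signals.
module fifo_if_chk (
  input logic wclk, wclk_reset_n, write_en, full,
  input logic rclk, rclk_reset_n, read_en, empty
);
  // The write side must not push while the FIFO reports full.
  a_no_overflow : assert property (
    @(posedge wclk) disable iff (!wclk_reset_n) full |-> !write_en)
    else $error("write_en asserted while FIFO is full");

  // The read side must not pop while the FIFO reports empty.
  a_no_underflow : assert property (
    @(posedge rclk) disable iff (!rclk_reset_n) empty |-> !read_en)
    else $error("read_en asserted while FIFO is empty");
endmodule
```

Each assertion is clocked in the domain that owns the flag it checks, so neither one samples a signal from the other clock domain.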

8.5 CDC Verification Methodology

Metastability from the intermixing of multiple clock signals is not accurately mod-
eled by simulation. Unless you leverage exhaustive, automated clock domain cross-
ing (CDC) analyses to identify and correct problem areas, you will inevitably suffer
unpredictable behavior when the chip samples come back from the fab. Bottom line:
automated CDC verification solutions are mandatory for multi-clock designs.

Traditional CDC verification methods include manually inspecting RTL code for
the presence of synchronizers, running full timing simulations, sweeping clocks
against each other, and using special simulation models to randomly vary the delays
through synchronizers. These methods find only a subset of errors in a given design.
An effective CDC verification methodology should include structural, protocol,
and re-convergence fanout verification (Ping Yeung, Ph.D.).
Structural Verification
Each synchronizer must have the correct structure for the type of signal being sent
across clock domains. For example, a 2-DFF synchronizer is usually the best solu-
tion for single-bit signals but should not be used for multi-bit signals unless they are
gray-coded to ensure that only one bit changes at a time. Multi-bit signals may be
synchronized across domains using a separate control signal, an asynchronous
FIFO, or other methods. Also, there should be no combinational logic inside or
before a synchronizer.
Protocol Verification
Each synchronizer must follow a set of rules, called a transfer protocol, to ensure
that the CDC signal is properly transferred across clock domains. For example, even
the simplest 2-DFF synchronizer requires that the transmitting signal be held stable
long enough to guarantee that it is captured in the receiving domain. This may not
occur if the transmitting clock is faster than the receiving clock. Synchronization
structures for multi-bit signals require more complex protocol checks. When CDC
transfer protocols are violated, an error may not occur in simulation but will eventu-
ally occur in real hardware. Protocol analysis should be done using static formal
methods. SVA should be deployed to check for correct protocol adherence.
Re-convergence Fanout Verification (Fig. 8.8)
Re-convergence occurs when multiple signals are synchronized separately from one
clock domain to another and then used by the same logic in the receiving domain. If
that logic assumes a timing relationship between the signals, the design is not
tolerant of metastability and will eventually fail. This is because synchronizers
only “filter out” metastability to ensure that unpredictable values are
not seen by the receiving logic; they do not guarantee that separately synchronized
signals arrive in the same receive clock cycle.

Fig. 8.8 Clock domain crossing—re-convergent fanout and CDC


Fig. 8.9 Clock domain crossing: automated methodology (blocks: RTL; structural analysis, static formal; protocol assertions; test vectors plus metastability injection; protocol analysis, static formal plus simulation; result database; debug)

Let us see how we can combine structural analysis with protocol analysis to
come up with an automated comprehensive methodology. Figure 8.9 is a generic
diagram representing the automated process many EDA vendors now provide.

8.5.1 Automated CDC Verification

Figure 8.9 shows a proposed methodology. EDA vendors have implemented similar
methodologies (or are working toward them).

8.5.2 Step 1: Structural Verification

Identify RTL blocks (not the entire SoC RTL) that have CDC signals at play. Feed
such RTL blocks to the static formal structural analysis tool. This tool will identify
CDC synchronization “structures” within your logic and analyze them to see if they
meet the requirements. For example, a single-bit CDC synchronization will work with a
two-flop synchronizer. But for a multi-bit crossing, the two-flop solution alone
won’t work. You may need an asynchronous FIFO-based solution or a gray counter (where
only 1 bit changes at a time). The tool will analyze such situations and provide a
structural analysis report. The results are also stored in a UCDB style database for
further debug analysis. This step should find issues with missing and incorrectly
implemented synchronizers and potential re-convergence problems.
More important in this step is the automatic derivation of SystemVerilog Assertions.
For example, for a two-flop synchronizer, the input data should remain stable for at
least 1.5x the receive clock period. The structural analysis tool will (should)
automatically write such assertions for the next stage of protocol verification.
There are many such constraints/checks that need to be provided to the protocol
analyzer. The structural analysis tool “knows” what type of structure has been
designed and thereby should be able to create assertions for the protocols that the
structure needs to adhere to.
You need to evaluate the structural analysis results provided by an EDA vendor
tool and either accept its recommendations or reject them and implement the best
structure that you envision. Don’t worry; the protocol analyzer will catch you if
your structure does not meet synchronization protocol requirements.

8.5.3 Step 2: Protocol Verification

Once the structural analysis is complete, the assertions (created either
automatically or manually) will be input to the protocol analyzer. The static formal
method employed in the protocol analyzer will try all possible combinations of inputs
(both in the temporal and combinational domains) to the RTL block and see if any of
the assertions FAIL. These assertions ensure that a CDC signal is stable when going
from the TX to the RX domain and that multiple-bit CDC data is either gray coded or
stable when it is sampled. The results will show failures which need to be analyzed
to correct the synchronizer. Multiple iterations of this step will make sure that
the logic will survive under all conditions of input and that the metastability has
been addressed.
In addition to static formal, you may also want to simulate using the created
assertions. For example, you may feel comfortable with sweeping clocks to check for
re-convergence logic. Or you may want to deploy the so-called static + simulation
hybrid methodology to check the structural integrity against the required protocol
specification.
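
One of the protocol checks mentioned above, that multi-bit CDC data is gray coded, can be expressed as a short assertion. This is a minimal sketch, not the output of any EDA tool: the checker name cdc_gray_chk, its ports, and the parameter WIDTH are hypothetical.

```systemverilog
// Hypothetical protocol checker for a Gray-coded multi-bit CDC bus.
// cdc_gray_chk, WIDTH, and the port names are illustrative assumptions.
module cdc_gray_chk #(
  parameter int WIDTH = 3
) (
  input logic             tx_clk,    // clock of the transmitting domain
  input logic             tx_rst_n,  // active-low reset of that domain
  input logic [WIDTH-1:0] cdc_bus    // Gray-coded value crossing the boundary
);
  // Between two consecutive transmit-clock edges, at most one bit may change.
  property p_one_bit_change;
    @(posedge tx_clk) disable iff (!tx_rst_n)
      $countones(cdc_bus ^ $past(cdc_bus)) <= 1;
  endproperty

  a_one_bit_change : assert property (p_one_bit_change)
    else $error("More than one bit of the CDC bus changed in one clock");
endmodule
```

For the FIFO of Sect. 8.3.5, such a checker could be attached to wr_ptr_gray on wclk and to rd_ptr_gray on rclk, since those are the buses that cross into the other clock domain.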

8.5.4 Step 3: Debug

Of course, debug is a big part of this strategy. The results from structural and
protocol analysis are stored in a UCDB style database. The debug tool will associate
the structure with the protocol and show the relationship. It will also help you
debug failing assertions. EDA tools do support such debug capabilities.
Based on the debug results, you will either change the RTL or change the input
test vectors and metastability injection strategy.
This loop will continue until there are no more assertions that fire and the
metastability issues are completely resolved.
This is what I call a proposed methodology. You may discuss it with EDA vendors
to see how close they come to it with their proposed solutions.

8.6 CDC Verification at Gate Level

The next problem is CDC at gate level. Gate-level simulations are notorious for
propagating an unknown “X” rapidly throughout the design. The two-flop synchronizer
can cause the “X” propagation problem. See Fig. 8.10 to understand this issue.
Fig. 8.10 Gate-level CDC (waveform of TxClk, RxClk, TxData, Rx1Data, RxData, and the design logic; annotations: setup violation, metastable region going ‘X’, stable regions going ‘X’, rest of the logic going ‘X’)

When doing gate-level simulations on a multi-clock design (Clifford
E. Cummings), the ASIC library flip-flops are modeled with setup and
hold time checks that match the timing specifications of the actual flip-flops.
ASIC libraries typically model flip-flops to drive X’s (unknowns) on the flip-flop
outputs when a timing violation occurs. When simulating gate-level synchronizers,
setup and hold time violations might cause ASIC libraries to issue setup and hold
time error messages, and the offending signals are frequently driven to an X value.
These X values propagate to the rest of the design, causing problems when trying to
verify the functionality of the entire gate-level design (Fig. 8.10).
There are several techniques available to turn “OFF” such “X” propagation when
doing CDC verification at gate level. If you are familiar with SystemVerilog and
EDA simulators, you may be familiar with these techniques. But for the sake of
completeness, here they are.
1. Change the flop setup and hold times to 0. This will obviously not give any
timing violation and hence prevent “X” from being generated because of a timing
violation. BUT this will basically nullify the setup/hold checks of “all” the flops
in your vendor-provided library. So this may not be a good strategy after all.
2. Turn OFF the “X” generation tied to the timing checks in the “specify” block of
the flop cells. Many vendors provide a command line option “+no_notifier” to
automatically turn OFF “X” generation due to a timing violation. I believe this is
a preferred methodology, since you will still get a timing violation message telling
you that there is a synchronization issue, but the flop will not generate an “X.”

8.7 EDA Vendors and CDC Tools Support

So what kind of industry tools are available to help a DV engineer tackle CDC
verification? Synopsys SpyGlass CDC and Mentor’s Questa CDC are two of the many
tools available in the EDA market. I’ve described only Mentor’s solution. Synopsys
does not provide information on their SpyGlass CDC tool unless you register, so it
is not covered here.

8.7.1 Mentor

Here’s a brief description of Mentor’s Questa CDC methodology (Fig. 8.11)
(MentorGraphics). The following description is taken directly from Mentor’s
marketing literature. Make sure that their claims are indeed valid!
Questa® CDC identifies errors using structural analysis to recognize clock
domains, synchronizers, and low-power structures via the Unified Power Format
(UPF). It generates assertions for protocol verification along with metastability
models for re-convergence verification. All properties and design intent are inferred
by the software.
The technology checks all potential CDC failures, statically verifying that all
signals crossing asynchronous clock domain boundaries are guarded by CDC syn-
chronizers. It then illustrates DUT issues found with familiar schematic and wave-
form displays. Additionally, in concert with Questa simulation, the CDC-FX app
injects metastability into RTL functional simulation to verify if the DUT is tolerant
of random delays caused by metastability.
Low-power intent awareness: Questa CDC accepts your UPF file without modification
to ensure low-power circuitry does not introduce CDC-related issues.
Specifically, Questa CDC considers all isolation and retention cells and power
domains with dynamic voltage and frequency scaling (DVFS), and verifies voltage
domain crossing (VDC) paths.

Fig. 8.11 Mentor Questa CDC methodology
