COM-1209ASOFT HIGH-SPEED
DVB-S2 BCH CODE DECODER & ENCODER
VHDL SOURCE CODE OVERVIEW
Overview
- High-speed BCH block code encoder and decoder for FPGAs
- Fully compliant with the DVB-S2 standard ETSI EN 302 307
- Parallel I/Os and processing for high-speed operation:
  - 1 Gbits/s encoding [Virtex-5]
  - 650 Mbits/s encoding [Spartan-3]
  - up to 950 Mbits/s decoding [Virtex-5]
  - up to 400 Mbits/s decoding [Spartan-3]
- Corrects t = 8, 10 or 12 errors per block.
- Decoder flags frames with uncorrectable errors.
- Decoder reports number of bit errors corrected at the end of each decoded block.
The SAMPLE_CLK_x_REQ flow control signals carry Clear To Send
information from the stream recipient. The data source must stop
sending data immediately when the data sink clears this signal.
All inputs and outputs are synchronous with the
rising edge of the synchronous clock CLK.
Speed
FPGA         Clock (max)
Spartan-3    83 MHz
Virtex-5     131 MHz
A guard time of at least (Nbch - Kbch)/8 + 2 clocks
must be inserted between successive input
frames to let the encoder send the parity bits to its
output. More generally, the data source should
check the flow control signal
SAMPLE_CLK_IN_REQ before sending any input
data to the encoder.
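As a worked example of the guard time rule above, the following minimal sketch evaluates (Nbch - Kbch)/8 + 2 for the code variants listed in this document. The function name `encoder_guard_clocks` is hypothetical and is not part of the VHDL source.

```python
def encoder_guard_clocks(nbch: int, kbch: int) -> int:
    """Minimum idle clocks between input frames: (Nbch - Kbch)/8 + 2."""
    parity_bits = nbch - kbch
    # The encoder outputs one byte per clock, so flushing the parity takes
    # parity_bits / 8 clocks, plus the 2 extra clocks quoted in the text.
    if parity_bits % 8:
        raise ValueError("parity length is expected to be a multiple of 8")
    return parity_bits // 8 + 2


# (Nbch, Kbch) pairs taken from the examples table of this document.
for nbch, kbch in [(16200, 16008), (51840, 51648), (54000, 53840),
                   (58320, 58192), (3240, 3072)]:
    print(nbch, kbch, encoder_guard_clocks(nbch, kbch))
```

For instance, the (51840,51648,12) code needs 192/8 + 2 = 26 idle clocks, and the (58320,58192,8) code needs 18.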
Device Utilization Summary
Encoder
I/Os
- Input side: CLK, SYNC_RESET, DATA_IN[7:0], SAMPLE_CLK_IN, SOF_IN, SAMPLE_CLK_IN_REQ
- Output side: DATA_OUT[7:0], SAMPLE_CLK_OUT, SOF_OUT, SAMPLE_CLK_OUT_REQ
- Controls: CONTROL[1:0], KBCH[15:0]
8-bit parallel data input and output help maximize
the throughput. The first byte in the stream is
marked by a Start Of Frame (SOF) flag.
Encoder output data rate (max)
Spartan-3    650 Mbits/s
Virtex-5     1 Gbits/s
Device: Xilinx Spartan-3
Number of slices     672
Flip Flops           258
4 input LUTs         1268
RAMB16               0
18x18 multiplier     0
GCLKs                1
Device: Xilinx Virtex-5
Number of slices     482
Flip Flops           265
LUTs                 964
RAMB                 0
DSP                  0
GCLKs                1
Flow control is ensured through the
SAMPLE_CLK_x_REQ signals.
MSS 18221-A Flower Hill Way Gaithersburg, Maryland 20879 U.S.A.
Telephone: (240) 631-1111 Facsimile: (240) 631-1676 www.ComBlock.com
MSS 2000-2009 Issued 8/27/2009
For the (51840,51648,12) code, the decoding time budget is
51840/8 + (16.5 + 328)*12 = 10614 clocks per frame.
Decoder
I/Os
- Input side: CLK, SYNC_RESET, DATA_IN[7:0], SAMPLE_CLK_IN, SOF_IN, SAMPLE_CLK_IN_REQ
- Output side: DATA_OUT[7:0], SAMPLE_CLK_OUT, SOF_OUT, EOF_OUT, SAMPLE_CLK_OUT_REQ
- Monitoring: N_CORRECTED[3:0], GOOD_FRAME
- Controls: CONTROL[1:0], NBCH[15:0], KBCH[15:0]
Speed
FPGA         Clock (max)
Spartan-3    73 MHz
Virtex-5     166 MHz
Decoder input data rate (max), Spartan-3: 356 Mbits/s for the (51840,51648,12) code
Device Utilization Summary
Device: Xilinx Spartan-3
Number of slices               4906
Flip Flops                     5002
4 input LUTs                   7910
RAMB16                         9
18x18 multiplier               0
GCLKs                          1
Total equivalent gate count    679,436
Device: Xilinx Virtex-5
Number of slices               2626
Flip Flops                     4995
LUTs                           6919
RAMB18X2s                      5
DSP                            0
GCLKs                          1
Decoder input data rate (max), Spartan-3: 423 Mbits/s for the (58320,58192,8) code
DVB-S2 BCH
Decoder input data rate (max), Virtex-5: 810 Mbits/s for the (51840,51648,12) code
The DVB-S2 standard includes 21 variants of
long BCH codes. Each variant is identified by its
code block size Nbch, uncoded block size Kbch, error
correction capability t and frame type.
Decoder input data rate (max), Virtex-5: 963 Mbits/s for the (58320,58192,8) code
The decoder architecture is such that the three
decoding stages are pipelined and the input data is
stored in a 128 Kbits elastic buffer until the error
locations are found. Therefore, it is possible to input
a new frame even before the previous one is
completely decoded.
The processing time budget for each decoding stage
can be expressed as follows:
1. Syndrome computation: 16.5 * t clocks.
2. Error location polynomial: 328 * t clocks in
the worst case, when t errors are present in
the received frame.
3. Factoring the error location polynomial
before the first output byte: $(2^{16} - N_{bch})/8 + 4$
clocks for $\mathrm{GF}(2^{16})$.
4. Output: Kbch/8 clocks.
Using the above information, one can compute the
maximum decoding speed for each DVB-S2 BCH
code variant: the frame time is Nbch/8 input clocks
plus the syndrome and error location polynomial budgets.
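The arithmetic can be sketched as follows. This is an illustration only: it assumes the frame time is Nbch/8 input clocks plus the worst-case syndrome (16.5 t) and error location polynomial (328 t) budgets, as in the document's (51840,51648,12) example, and it uses the maximum clock frequencies quoted in the decoder Speed table. The function names are hypothetical.

```python
def decoder_clocks_per_frame(nbch: int, t: int) -> float:
    """Clocks per frame: Nbch/8 input clocks + (16.5 + 328) * t."""
    return nbch / 8 + (16.5 + 328) * t


def decoder_input_rate_bps(nbch: int, t: int, f_clk_hz: float) -> float:
    """Sustained input bit rate when one frame is accepted per frame time."""
    return nbch * f_clk_hz / decoder_clocks_per_frame(nbch, t)


# Clock frequencies (Hz) from the decoder Speed table in this document.
CLOCKS = {"Spartan-3": 73e6, "Virtex-5": 166e6}

for nbch, kbch, t in [(51840, 51648, 12), (58320, 58192, 8)]:
    clocks = decoder_clocks_per_frame(nbch, t)
    for fpga, f_clk in CLOCKS.items():
        rate = decoder_input_rate_bps(nbch, t, f_clk)
        print(f"({nbch},{kbch},{t}) {fpga}: {clocks:.0f} clocks, "
              f"{rate / 1e6:.1f} Mbits/s")
```

Under these assumptions the (51840,51648,12) code takes 10614 clocks per frame, which corresponds to roughly 810 Mbits/s at 166 MHz and 356 Mbits/s at 73 MHz, in line with the rates quoted in this document.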
The VHDL code implements all 21 variants listed
in tables 5a and 5b of the specifications [1].
Examples (see [1] for complete list)
Kbch     Nbch     t     Frame
16008    16200    12    normal
51648    51840    12    normal
53840    54000    10    normal
58192    58320    8     normal
3072     3240     12    short
The codes for normal frames are computed in
$\mathrm{GF}(2^{16})$, whereas the short frame codes are
computed over $\mathrm{GF}(2^{14})$.
The primitive polynomials used to generate the
Galois fields are
$x^{16} + x^5 + x^3 + x^2 + 1$ for $\mathrm{GF}(2^{16})$ and
$x^{14} + x^5 + x^3 + x + 1$ for $\mathrm{GF}(2^{14})$.
Matlab:
primpoly(16,'min')
primpoly(14,'min')
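For readers without Matlab, the primitivity of the two polynomials above can be checked with a short script. The sketch below is not taken from the VHDL source: it encodes each polynomial as an integer bit mask (bit i is the coefficient of x^i) and measures the multiplicative order of x, which equals 2^m - 1 exactly when the polynomial is primitive. The names `order_of_x`, `PRIM16` and `PRIM14` are hypothetical.

```python
def order_of_x(poly: int, m: int) -> int:
    """Multiplicative order of x modulo a degree-m binary polynomial."""
    state = 1
    for n in range(1, 1 << m):
        state <<= 1              # multiply by x
        if state >> m:           # degree reached m: subtract (XOR) the modulus
            state ^= poly
        if state == 1:
            return n
    return 0                     # x never returned to 1


# Bit masks for the two polynomials given in the text.
PRIM16 = (1 << 16) | (1 << 5) | (1 << 3) | (1 << 2) | 1   # x^16+x^5+x^3+x^2+1
PRIM14 = (1 << 14) | (1 << 5) | (1 << 3) | (1 << 1) | 1   # x^14+x^5+x^3+x+1

for name, poly, m in [("GF(2^16)", PRIM16, 16), ("GF(2^14)", PRIM14, 14)]:
    primitive = order_of_x(poly, m) == (1 << m) - 1
    print(name, hex(poly), "primitive" if primitive else "not primitive")
```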
The specification document lists the 12 minimal
polynomials $g_i(x)$ for $\mathrm{GF}(2^{16})$ and $\mathrm{GF}(2^{14})$ in tables
6a and 6b respectively.
By multiplying the first 8, 10 or 12 minimal
polynomials, we can construct the generator
polynomials for four configurations: $\mathrm{GF}(2^{16})$ with
t = 8, 10, 12 and $\mathrm{GF}(2^{14})$ with t = 12.
The resulting generator polynomials can be
represented by their binary coefficients as listed
below:
-- degree 128: GF(2^16), t = 8
constant GENPOLY0: std_logic_vector(128 downto 0) :=
    "1" & x"1c07255f712797bd19fc6d7504f9662B";
-- degree 160: GF(2^16), t = 10
constant GENPOLY1: std_logic_vector(160 downto 0) :=
    "1" & x"60150CEDFC2A331F6A785703EFD12301B8BB6591";
-- degree 192: GF(2^16), t = 12
constant GENPOLY2: std_logic_vector(192 downto 0) :=
    "1" & x"4E260E83845C511C50CF2CD8DC350889034785F7660255E7";
-- degree 168: GF(2^14), t = 12
constant GENPOLY3: std_logic_vector(168 downto 0) :=
    "1" & x"4062DBEA9869B262CD23A39069528FE7D7D11905A5";
The encoder uses these generator polynomials to
generate the (Nbch - Kbch) parity bits appended to the
Kbch input data bits. As described in section 5.3.1 of
the DVB-S2 specifications [1], the parity bits are
the remainder of a polynomial division of the
shifted input bits by the generator polynomial.
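The division described above can be illustrated on a toy code. The sketch below is an assumption-laden illustration, not the VHDL encoder: it uses the small BCH(15,7) code with generator x^8 + x^7 + x^6 + x^4 + 1 instead of the DVB-S2 generators, processes bits serially rather than 8 at a time, and represents polynomials as Python integers (bit i is the coefficient of x^i). The names `poly_mod`, `bch_encode` and `G_15_7` are hypothetical.

```python
def poly_mod(dividend: int, divisor: int) -> int:
    """Remainder of binary polynomial division (coefficients in GF(2))."""
    deg = divisor.bit_length() - 1
    while dividend.bit_length() - 1 >= deg:
        # Cancel the leading term of the dividend.
        dividend ^= divisor << (dividend.bit_length() - 1 - deg)
    return dividend


def bch_encode(message: int, generator: int) -> int:
    """Systematic encoding: message bits followed by the parity bits."""
    n_parity = generator.bit_length() - 1
    shifted = message << n_parity            # m(x) * x^(n-k)
    return shifted | poly_mod(shifted, generator)


G_15_7 = 0b111010001   # toy generator: x^8 + x^7 + x^6 + x^4 + 1

codeword = bch_encode(0b1011001, G_15_7)
print(f"{codeword:015b}")
# A valid codeword is a multiple of the generator polynomial.
print(poly_mod(codeword, G_15_7) == 0)
```

Because the remainder is appended to the shifted message, every codeword is divisible by the generator, which is the property the decoder's syndrome stage relies on.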
DVB-S2 BCH Decoding
Decoding a BCH block is done in three steps:
1. compute the syndromes
2. derive the error location polynomial
3. find the roots of the error location
polynomial and correct the bit errors.
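The remainder $b_j(x)$ is then evaluated at $\alpha^i$ as
$S_i = b_j(\alpha^i)$.
The roots of $g_j(x)$ are as follows:
Minimal polynomial $g_j(x)$    Roots $\alpha^i$
$g_1(x)$       $\alpha, \alpha^2, \alpha^4, \alpha^8, \alpha^{16}$
$g_2(x)$       $\alpha^3, \alpha^6, \alpha^{12}, \alpha^{24}$
$g_3(x)$       $\alpha^5, \alpha^{10}, \alpha^{20}$
$g_4(x)$       $\alpha^7, \alpha^{14}$
$g_5(x)$       $\alpha^9, \alpha^{18}$
$g_6(x)$       $\alpha^{11}, \alpha^{22}$
$g_7(x)$       $\alpha^{13}$
$g_8(x)$       $\alpha^{15}$
$g_9(x)$       $\alpha^{17}$
$g_{10}(x)$    $\alpha^{19}$
$g_{11}(x)$    $\alpha^{21}$
$g_{12}(x)$    $\alpha^{23}$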
When a received frame is error-free, all syndromes
are zero.
Verifying the syndrome computation is easy.
Assuming two bit errors at locations 16191 and
16184 (with bit locations being numbered from
Nbch-1 (first bit received) to 0), then
$S_1 = \alpha^{16191} + \alpha^{16184}$
$S_2 = (\alpha^{16191})^2 + (\alpha^{16184})^2$
$S_3 = (\alpha^{16191})^3 + (\alpha^{16184})^3$
...
$S_{2t} = (\alpha^{16191})^{2t} + (\alpha^{16184})^{2t}$
Matlab:
prim_poly16 = primpoly(16,'min');
alpha = gf(2,16,prim_poly16);   % primitive element of GF(2^16)
s1 = alpha^16191 + alpha^16184
s2 = (alpha^16191)^2 + (alpha^16184)^2
s3 = (alpha^16191)^3 + (alpha^16184)^3
s24 = (alpha^16191)^24 + (alpha^16184)^24
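The same check can be sketched without Matlab. The code below is a minimal illustration, assuming elements of GF(2^16) are stored as 16-bit integers in the polynomial basis defined by x^16 + x^5 + x^3 + x^2 + 1, with alpha represented by the integer 2. The helper names (`gf_mul`, `gf_pow`, `syndromes`) are hypothetical and do not come from the VHDL source.

```python
PRIM16 = 0x1002D   # x^16 + x^5 + x^3 + x^2 + 1
ALPHA = 2          # the element "x", i.e. alpha


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^16) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10000:
            a ^= PRIM16
    return result


def gf_pow(a: int, exponent: int) -> int:
    """Raise a field element to a non-negative power (square and multiply)."""
    result = 1
    while exponent:
        if exponent & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        exponent >>= 1
    return result


def syndromes(error_positions, t: int):
    """S_i = sum over error positions p of alpha^(p*i), for i = 1..2t."""
    out = []
    for i in range(1, 2 * t + 1):
        s = 0
        for p in error_positions:
            s ^= gf_pow(ALPHA, p * i)
        out.append(s)
    return out


S = syndromes([16191, 16184], t=12)
print([hex(s) for s in S[:3]])
```

A useful sanity check in characteristic 2 is that each even-indexed syndrome is the square of an earlier one (S2 = S1^2, S4 = S2^2), and that an error-free frame yields all-zero syndromes.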
Syndromes
To compute a syndrome $S_i$, one must first divide
the input block by the twelve polynomials $g_j(x)$,
where $g_j(x)$ is the minimal polynomial of
$\alpha^i$ for i = 1 to 2t (see the table of roots). The twelve
minimal polynomials $g_j(x)$ are listed in the DVB-S2 specifications in Tables 6a and 6b.
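The reason the division step works is that any root of $g_j(x)$ is also a root of the quotient term, so the received block and its remainder take the same value there. The sketch below illustrates this on a toy field, GF(2^4) built from x^4 + x + 1, rather than on the DVB-S2 fields; that choice, the sample block and all names are assumptions made for brevity.

```python
PRIM4 = 0b10011   # x^4 + x + 1, also the minimal polynomial of alpha here


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^4) elements."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= PRIM4
    return result


def poly_mod(dividend: int, divisor: int) -> int:
    """Remainder of binary polynomial division (bit i = coefficient of x^i)."""
    deg = divisor.bit_length() - 1
    while dividend.bit_length() - 1 >= deg:
        dividend ^= divisor << (dividend.bit_length() - 1 - deg)
    return dividend


def poly_eval(poly_bits: int, x: int) -> int:
    """Evaluate a binary polynomial at a field element (Horner's rule)."""
    acc = 0
    for i in range(poly_bits.bit_length() - 1, -1, -1):
        acc = gf_mul(acc, x) ^ ((poly_bits >> i) & 1)
    return acc


received = 0b101100111000101         # arbitrary 15-bit block
remainder = poly_mod(received, PRIM4)

# alpha = 2 and alpha^2 = 4 are both roots of x^4 + x + 1.
for root in (2, 4):
    print(poly_eval(received, root) == poly_eval(remainder, root))
```

Evaluating the short remainder instead of the full block is what keeps the per-syndrome evaluation cheap once the division has been done on the fly.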
Error Location Polynomial
The Berlekamp-Massey algorithm is implemented
to find the error location polynomial
$\Lambda(x) = (1 + L_1 x)(1 + L_2 x)(1 + L_3 x)(1 + L_4 x) \cdots$
where $L_i$ are the error locations.
At the end of this step, the error location
polynomial is expressed as
$\Lambda(x) = \lambda_0 + \lambda_1 x + \lambda_2 x^2 + \lambda_3 x^3 + \cdots$
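A textbook form of the Berlekamp-Massey iteration is sketched below for reference. It is a software illustration under stated assumptions (GF(2^16) elements as 16-bit integers, alpha = 2, coefficients returned in ascending powers of x) and makes no claim about how bcherrorlocator.vhd schedules the computation. All function names are hypothetical.

```python
PRIM16 = 0x1002D   # x^16 + x^5 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two GF(2^16) elements."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10000:
            a ^= PRIM16
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    return gf_pow(a, 0xFFFE)   # a^(2^16 - 2) is the inverse of a != 0


def berlekamp_massey(synd):
    """Return [lambda_0, lambda_1, ...] for syndromes [S_1, ..., S_2t]."""
    c, b = [1], [1]                    # current and previous polynomials
    length, shift, last_d = 0, 1, 1
    for n, s in enumerate(synd):
        d = s                          # discrepancy for this step
        for i in range(1, length + 1):
            d ^= gf_mul(c[i], synd[n - i])
        if d == 0:
            shift += 1
            continue
        scale = gf_mul(d, gf_inv(last_d))
        prev_c = c[:]
        c += [0] * (len(b) + shift - len(c))
        for i, coef in enumerate(b):   # c(x) += scale * x^shift * b(x)
            c[i + shift] ^= gf_mul(scale, coef)
        if 2 * length <= n:
            length, b, last_d, shift = n + 1 - length, prev_c, d, 1
        else:
            shift += 1
        c += [0] * (length + 1 - len(c))
    return c[:length + 1]


# Two bit errors at locations 16191 and 16184, t = 12 (24 syndromes).
x1, x2 = gf_pow(2, 16191), gf_pow(2, 16184)
synd = [gf_pow(x1, i) ^ gf_pow(x2, i) for i in range(1, 25)]
elp = berlekamp_massey(synd)
print([hex(v) for v in elp])
```

For this two-error case the result has degree 2, with the linear coefficient equal to S1 and the quadratic coefficient equal to the product of the two error locators.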
Comparing the VHDL simulation with Matlab is
easy. Let us assume two bit errors at locations
16191 and 16184 (with bit locations being
numbered from Nbch-1 (first bit received) to 0); the
error location polynomial is then computed by
expanding $(1 + \alpha^{16191} x)(1 + \alpha^{16184} x)$.
Matlab:
prim_poly16 = primpoly(16,'min');
alpha = gf(2,16,prim_poly16);
p1 = [alpha^16191 1];   % alpha^16191 * x + 1
p2 = [alpha^16184 1];   % alpha^16184 * x + 1
elp = conv(p1,p2);      % coefficients in descending powers of x
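An equivalent expansion can be sketched in Python for cross-checking. As before, this assumes GF(2^16) elements are 16-bit integers with alpha = 2. Coefficients are kept in descending powers of x to mirror the Matlab `conv` convention. The helper names are hypothetical.

```python
PRIM16 = 0x1002D   # x^16 + x^5 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two GF(2^16) elements."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10000:
            a ^= PRIM16
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def poly_mul(p, q):
    """Multiply two polynomials with GF(2^16) coefficients."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out


def poly_eval(p, x):
    """Evaluate a polynomial given in descending powers (Horner's rule)."""
    acc = 0
    for coef in p:
        acc = gf_mul(acc, x) ^ coef
    return acc


x1, x2 = gf_pow(2, 16191), gf_pow(2, 16184)
elp = poly_mul([x1, 1], [x2, 1])     # (x1*x + 1) * (x2*x + 1)
print([hex(v) for v in elp])

# The roots are the inverses of the error locators: alpha^(-16191), alpha^(-16184).
print(poly_eval(elp, gf_pow(2, 65535 - 16191)) == 0)
```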
Factoring and Error Correction
Chien's search circuit [3] is used to factor the error
location polynomial $\Lambda(x)$. While the data bit at
location $L_i$ is streamed to the output, the algorithm
assesses whether $L_i$ is a root of $\Lambda(x)$. If so, the bit is
erroneous and is corrected.
128 Kbits of block RAM is used as an elastic buffer to
temporarily store the received bits while error
decoding takes place.
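The search can be sketched in software as one register per coefficient of the error location polynomial, each multiplied by a fixed power of alpha at every step. The sketch below is illustrative only: it tests one bit position per step, whereas the VHDL processes a byte per clock, and it preloads the registers for position Nbch-1 directly instead of stepping through the unused positions of the shortened code. It assumes coefficients in ascending powers of x and alpha = 2. All names are hypothetical.

```python
PRIM16 = 0x1002D   # x^16 + x^5 + x^3 + x^2 + 1
FIELD_ORDER = 65535


def gf_mul(a, b):
    """Multiply two GF(2^16) elements."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10000:
            a ^= PRIM16
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def chien_search(elp, nbch):
    """Return the bit positions p (Nbch-1 down to 0) where elp(alpha^-p) = 0."""
    # Register k starts at lambda_k * alpha^(-k * (Nbch - 1)).
    regs = [gf_mul(c, gf_pow(2, (-k * (nbch - 1)) % FIELD_ORDER))
            for k, c in enumerate(elp)]
    # Moving from position p to p - 1 multiplies register k by alpha^k.
    steps = [gf_pow(2, k) for k in range(len(elp))]
    errors = []
    for pos in range(nbch - 1, -1, -1):
        total = 0
        for r in regs:
            total ^= r
        if total == 0:
            errors.append(pos)        # this bit would be inverted on output
        regs = [gf_mul(r, s) for r, s in zip(regs, steps)]
    return errors


x1, x2 = gf_pow(2, 16191), gf_pow(2, 16184)
elp = [1, x1 ^ x2, gf_mul(x1, x2)]    # (1 + x1*x)(1 + x2*x), ascending powers
print(chien_search(elp, 16200))
```

If the number of roots found differs from the degree of the error location polynomial, the frame holds more errors than the code can correct, which is the situation the decoder flags as an uncorrectable frame.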
Flow Control
The decoder input first goes through an input elastic buffer to regulate the flow.
The buffer output data flow is sent to two components: the syndrome computation bch_syndromes.vhd and the
error correction bch_ec.vhd. Thus, both components are able to control the data flow from the input elastic
buffer using their flow control signals SAMPLE0A_CLK_REQ and SAMPLE0B_CLK_REQ respectively.
Syndrome computation is performed on the fly. Upon reading the last frame byte from the input elastic buffer,
bch_syndromes.vhd exercises its SAMPLE0A_CLK_REQ flow control signal to immediately stop the input
flow before a new start of frame. The end of the syndrome computation is marked by the availability of the
syndromes (SYNDROME1 through SYNDROME24) and a pulse SYNDROME_SAMPLE_CLK. At this point
bch_syndromes.vhd is ready for the next input frame.
The syndromes are passed to bcherrorlocator.vhd to compute the error location polynomial. The computation
is triggered by the SYNDROME_SAMPLE_CLK pulse and ends at the ELP_CLK pulse. The resulting error
location polynomial coefficients are available in ELP1 through ELP12. In the special case of an error-free frame, there is no
need to compute the error location polynomial. The ALL_ZERO_SYNDROMES net goes high when this
happens.
The final decoding step, error correction, is implemented within the bch_ec.vhd component. This component
includes a 128 Kbit elastic buffer large enough to receive a new frame while processing the previous one. The
purpose of the SAMPLE0B_CLK_REQ flow control flag is to stop the input data flow unless at least 1/32 of the
internal elastic buffer is available.
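The 1/32 threshold can be pictured with a small behavioral model. This is a hypothetical sketch, not the bch_ec.vhd logic: it assumes the buffer is byte-addressed, that 128 Kbits means 128 * 1024 bits, and that the flag is high when the source may send.

```python
BUFFER_BITS = 128 * 1024          # assumption: 1 Kbit = 1024 bits
BUFFER_BYTES = BUFFER_BITS // 8   # 16384 bytes


def sample0b_clk_req(bytes_stored: int) -> bool:
    """Model of the flag: True while at least 1/32 of the buffer is free."""
    free_bytes = BUFFER_BYTES - bytes_stored
    return free_bytes * 32 >= BUFFER_BYTES


for stored in (0, 8192, BUFFER_BYTES - 512, BUFFER_BYTES - 511):
    print(stored, sample0b_clk_req(stored))
```

With these assumptions the flag drops once fewer than 512 bytes remain free, which leaves headroom for bytes already in flight when the source reacts to the flag.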
Typical bch_dec.vhd capture. Includes input frames with correctable and uncorrectable errors.
The flow control is primarily located within bch_ec.vhd and is somewhat intricate. There are five key events in
the decoding of a BCH frame, in order of occurrence:
- input start of frame pulse SOF_IN
- received all input data (excluding parity bits) INPUT_DATA_COMPLETE
- syndrome ready pulse SYNDROME_SAMPLE_CLK
- error location ready pulse ELP_CLK_IN
- all decoded bytes sent out OUTPUT_DATA_COMPLETE
Note: the error location computation is skipped if ALL_ZERO_SYNDROMES is high.
[State diagram: bch_ec.vhd flow control. States include "0 idle", "1 frame processed, 0 frame pending" and "1 frame processed, 1 frame pending", plus intermediate states numbered 4 and 5. Transitions are driven by SOF_IN, input complete, syndromes ready, the all-zero syndromes test (yes/no), ELP ready and output complete.]
Typical bch_ec.vhd capture. Includes frames with correctable and uncorrectable errors.
Configuration Management
Reference documents
[1] ETSI EN 302 307, Section 5.3 FEC encoding
[2] Shift-Register Synthesis and BCH Decoding,
James L. Massey, IEEE Transactions on
Information Theory, January 1969.
[3] Error Control Coding, Fundamentals and
Applications, Shu Lin / Daniel Costello.
The current software revision is 1.
VHDL development environment
The VHDL software was developed using the
Xilinx ISE 8.2, 9.1 and 10.1 development
environments. The synthesis tool is Xilinx XST.
Target FPGA
The VHDL code is ready-to-use with Xilinx
Spartan-3, Virtex-4 and Virtex-5 FPGAs. Other
FPGAs may need very minor adjustments.
Xilinx-specific code
The VHDL source code was written in generic
VHDL with few Xilinx primitives. No Xilinx
CORE is used. The Xilinx primitives are:
- BUFG
- RAMB16_S9_S9
Decoder
VHDL software hierarchy
The code is stored with one, and only one, entity
per file.
Encoder
The decoder hierarchical structure reflects the three
successive decoding steps:
1. bch_syndromes.vhd: compute the
syndromes
2. bcherrorlocator.vhd: derive the error
location polynomial
3. bch_ec.vhd: find the roots of the error
location polynomial and correct the bit
errors.
Clock / Timing
The software uses a single master clock (CLK)
which serves as input clock, output clock and signal
processing clock.
Test Benches
Several test benches are included for end-to-end
and component-level VHDL simulation:
tbbchencdec2.vhd: end-to-end simulation
including encoder, decoder and added bit
errors.
ComBlock Ordering Information
COM-1209ASOFT
High-speed DVB-S2 BCH
encoder & decoder. VHDL source code.
Contact Information
MSS 18221-A Flower Hill Way
Gaithersburg, Maryland 20879 U.S.A.
Telephone: (240) 631-1111
Facsimile: (240) 631-1676
E-mail: [email protected]