Week 12

18/11/2024

Digital System Design


CS-431

Embedded Systems Design: A Unified Hardware/Software Introduction, (c) 2000 Vahid/Givargis

7 Series FPGAs: Clock Management Tile (CMT)

• Clock management tile (CMT)
– Performs frequency synthesis
– Clock de-skew
– Jitter filtering
– Supports a high input frequency range


CMT

[Figure: a clock signal from the outside world enters through a special clock pin and pad and feeds the clock manager, which produces daughter clocks used to drive internal clock trees, output pins, etc.]


CMT: Jitter Removal


• In the real world, clock edges may arrive a little early or a little late.
• The resulting variation in edge timing is called jitter; superimposing successive cycles shows a "fuzzy" clock.
• The FPGA clock manager can be used to detect and correct for this jitter and
provide a “clean” daughter clock signal for use inside the device.
[Figure: ideal clock signal compared with a real clock signal with jitter; cycles 1 to 4 are superimposed to show the fuzzy edges]


CMT: Frequency Synthesis

• The clock manager can be used to generate daughter clocks with frequencies
that are derived by multiplying or dividing the original signal.

[Figure: daughter clocks at 1.0 x, 2.0 x, and 0.5 x the original clock frequency]
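
The multiply/divide relationship can be sketched in a few lines of Python. The helper name `synthesized_frequency` and its parameters are illustrative assumptions, not part of any vendor tool, and real clock managers restrict which ratios are legal.

```python
from fractions import Fraction

def synthesized_frequency(f_in_hz, multiply=1, divide=1):
    """Daughter clock frequency f_in * multiply / divide (hypothetical helper)."""
    if multiply <= 0 or divide <= 0:
        raise ValueError("multiply and divide must be positive integers")
    # Fraction keeps the ratio exact instead of rounding to a float
    return Fraction(f_in_hz) * Fraction(multiply, divide)

f_in = 100_000_000  # assumed 100 MHz input clock, for illustration only
same = synthesized_frequency(f_in)                 # 1.0 x original
double = synthesized_frequency(f_in, multiply=2)   # 2.0 x original
half = synthesized_frequency(f_in, divide=2)       # 0.5 x original
```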


CMT: Phase Shifting

• Certain designs require the use of clocks that are phase shifted (delayed) with
respect to each other.
• Some clock managers allow you to select from fixed phase shifts of common values
such as 120° and 240° (for a three-phase clocking scheme)

[Figure: clock waveforms phase shifted by 0°, 90°, 180°, and 270°]
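
A phase shift is simply a delay expressed as a fraction of the clock period. The sketch below converts degrees to a delay; the function name and the 10 ns period are assumptions chosen for illustration.

```python
def phase_shift_delay(period_ns, phase_deg):
    """Delay in ns that corresponds to a phase shift of phase_deg degrees."""
    return period_ns * (phase_deg % 360) / 360.0

period_ns = 10.0  # assumed 100 MHz clock
three_phase = [phase_shift_delay(period_ns, p) for p in (0, 120, 240)]
quadrature = [phase_shift_delay(period_ns, p) for p in (0, 90, 180, 270)]
```

With a 10 ns period, the quadrature clocks are delayed by 0, 2.5, 5, and 7.5 ns, and the three-phase clocks by 0, one third, and two thirds of a period.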

Growing DSP Performance Gap


Typical DSP Operation

Diagram of a typical FIR filter


- Inherently parallel computation
- N taps
- All N multiplications should happen in parallel
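
A minimal software model of an N-tap FIR filter makes the structure concrete: each output is the sum of N products of coefficients and delayed samples. The function name and test values are illustrative; a processor evaluates the products one after another, whereas an FPGA can compute them simultaneously.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n - k]."""
    n_taps = len(coeffs)
    delay_line = [0] * n_taps
    out = []
    for x in samples:
        # Shift the new sample into the delay line
        delay_line = [x] + delay_line[:-1]
        # These N products are independent, so hardware can form them in parallel
        products = [c * d for c, d in zip(coeffs, delay_line)]
        out.append(sum(products))
    return out

# Feeding a unit impulse returns the coefficients themselves
impulse_response = fir_filter([1, 0, 0, 0, 0], [1, 2, 3])
```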


Serial vs. Parallel DSP Processing


Embedded Multipliers/ DSP Slices


• Some functions, such as multipliers, are inherently slow if they are implemented by
connecting a large number of programmable logic blocks together.


Main Components and Functionality


• Multiplier: A high-speed multiplier capable of performing signed and unsigned integer and
fixed-point multiplications.

• Adder/Accumulator: Allows results of multiplications to be accumulated or added/subtracted,
enabling the implementation of multiply-accumulate (MAC) operations, which are essential for
many DSP algorithms.

• Pre-Adder: Supports pre-addition of inputs, useful for symmetric FIR filter implementations.

• Pipeline Registers: The DSP slice includes several pipeline stages, which shorten
combinational delays and allow high clock speeds.

• Control and Configuration Logic: The DSP slice provides configuration settings to control its
operational modes and to manage inputs and outputs dynamically.

DSP Slices

• All 7 series FPGAs contain DSP48E1 cells, which have the following features:
– 25x18 signed multiplier
– 48-bit add/subtract/accumulate
– 25-bit pre-adder
– Pipeline registers for high speed
– Pattern detector
– SIMD operators
– Cascade paths
– Dynamic pipeline control
• DSP48E1 slices can be inferred, instantiated, or accessed using IP cores
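
The multiplier and accumulator widths listed above can be modelled behaviourally in Python. This is only a sketch of a multiply-accumulate step with 25-bit and 18-bit signed operands and a 48-bit wrapping accumulator; the helper names are hypothetical and pipeline latency is ignored.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def mac(acc, a, b):
    """One multiply-accumulate step: acc + a * b, wrapped to 48 bits."""
    a = to_signed(a, 25)   # 25-bit signed multiplier operand
    b = to_signed(b, 18)   # 18-bit signed multiplier operand
    return to_signed(acc + a * b, 48)

acc = 0
for a, b in [(3, 4), (-2, 5), (7, -1)]:
    acc = mac(acc, a, b)   # 12, then 2, then -5
```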

Typical Slice Features


FIR Filter Mapped to DSP Slices


Non-DSP Functions (Addition)


DSP48E1 cell

[Figure: DSP48E1 block diagram. Data inputs A (30 bits), B (18 bits), C (48 bits), and D (25 bits) feed the dual A and D registers with pre-adder, the dual B register, the 25 x 18 multiplier (M), and the X, Y, and Z multiplexers ahead of the 48-bit adder/subtractor that produces P. Control inputs: INMODE, OPMODE, ALUMODE, CARRYINSEL, CARRYIN. Cascade inputs: ACIN, BCIN, PCIN, CARRYCASCIN, MULTSIGNIN. Outputs: P, CARRYOUT, PATTERNDETECT, and the cascade outputs ACOUT, BCOUT, PCOUT, CARRYCASCOUT, MULTSIGNOUT.]


X, Y, and Z Multiplexers

• Adder/subtractor operates on X, Y, Z and CIN operands
– Table shows basic operations
• X, Y, and Z multiplexers allow for dynamic OPMODEs
• Multiplier output requires both X and Y multiplexers

ALUMODE   Operation
0000      Z + X + Y + CIN
0001      -Z + (X + Y + CIN) - 1
0010      -Z - X - Y - CIN - 1
0011      Z - (X + Y + CIN)
Others    Logic operations

[Figure note: inputs are taken as normal or 17-bit right shifted with MSB fill for multi-precision arithmetic]
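
The four arithmetic ALUMODE settings can be checked with a small behavioural model. This is a sketch that assumes unsigned 48-bit wraparound; the function name is hypothetical and no flags or carry outputs are modelled.

```python
MASK48 = (1 << 48) - 1

def alu_arith(alumode, x, y, z, cin=0):
    """48-bit result for the four arithmetic ALUMODE settings."""
    s = x + y + cin
    if alumode == 0b0000:
        result = z + s
    elif alumode == 0b0001:
        result = -z + s - 1
    elif alumode == 0b0010:
        result = -z - s - 1
    elif alumode == 0b0011:
        result = z - s
    else:
        raise ValueError("other ALUMODE values select logic operations")
    return result & MASK48  # wrap to the 48-bit datapath

total = alu_arith(0b0000, x=3, y=4, z=10, cin=1)   # 10 + 3 + 4 + 1
difference = alu_arith(0b0011, x=3, y=4, z=10)     # 10 - (3 + 4)
```

In two's complement, -Z - 1 equals NOT Z, so mode 0001 computes (NOT Z) + X + Y + CIN and mode 0010 computes NOT (Z + X + Y + CIN).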


Apply Your Knowledge


OPMODE controls the behavior of the X, Y, and Z multiplexers.

1) Given the OPMODE table, what is the OPMODE for the following functions?
– C + A:B
– A*B + C
– P + C + PCIN
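
One way to work such exercises is to assemble the 7-bit OPMODE from the three multiplexer selections. The selector encodings below are assumptions taken from the DSP48E1 user guide (UG479) rather than from this slide, and the helper name is hypothetical; verify the values against the OPMODE table before relying on them.

```python
# Assumed multiplexer encodings (check against the OPMODE table in UG479)
X_SEL = {"0": 0b00, "M": 0b01, "P": 0b10, "A:B": 0b11}        # OPMODE[1:0]
Y_SEL = {"0": 0b00, "M": 0b01, "ONES": 0b10, "C": 0b11}       # OPMODE[3:2]
Z_SEL = {"0": 0b000, "PCIN": 0b001, "P": 0b010, "C": 0b011}   # OPMODE[6:4]

def opmode(x="0", y="0", z="0"):
    """Return OPMODE[6:0] as a 7-character bit string."""
    # The multiplier output needs both the X and Y multiplexers
    if (x == "M") != (y == "M"):
        raise ValueError("multiplier output needs both X and Y set to M")
    value = (Z_SEL[z] << 4) | (Y_SEL[y] << 2) | X_SEL[x]
    return format(value, "07b")

a_times_b_plus_c = opmode(x="M", y="M", z="C")   # X = M, Y = M, Z = C
```

Under these assumed encodings, A*B + C maps to OPMODE 0110101; the other two functions follow the same pattern.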


Two-Input Logic Functions


• 48-bit logic operations
– XOR, XNOR, AND, NAND, OR, NOR, NOT

[Figure: the X multiplexer selects 0, P, or A:B; the Y multiplexer selects 0 or all ones; the Z multiplexer selects 0, PCIN, P, or C. OPMODE[3:0] and ALUMODE[3:0] select the logic function.]

ALUMODEs

Logic Unit Mode    OPMODE[3:2]   ALUMODE[3:0]
X XOR Z            00            0100
X XNOR Z           00            0101
X XNOR Z           00            0110
X XOR Z            00            0111
X AND Z            00            1100
X AND (NOT Z)      00            1101
X NAND Z           00            1110
(NOT X) OR Z       00            1111
X XNOR Z           10            0100
X XOR Z            10            0101
X XOR Z            10            0110
X XNOR Z           10            0111
X OR Z             10            1100
X OR (NOT Z)       10            1101
X NOR Z            10            1110
(NOT X) AND Z      10            1111
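
The logic-unit table can be transcribed into a lookup keyed by (OPMODE[3:2], ALUMODE[3:0]). This is a behavioural sketch only; the names `LOGIC_OPS` and `logic_unit` are hypothetical.

```python
MASK48 = (1 << 48) - 1

# (OPMODE[3:2], ALUMODE[3:0]) -> two-input function, transcribed from the table
LOGIC_OPS = {
    (0b00, 0b0100): lambda x, z: x ^ z,        # X XOR Z
    (0b00, 0b0101): lambda x, z: ~(x ^ z),     # X XNOR Z
    (0b00, 0b0110): lambda x, z: ~(x ^ z),     # X XNOR Z
    (0b00, 0b0111): lambda x, z: x ^ z,        # X XOR Z
    (0b00, 0b1100): lambda x, z: x & z,        # X AND Z
    (0b00, 0b1101): lambda x, z: x & ~z,       # X AND (NOT Z)
    (0b00, 0b1110): lambda x, z: ~(x & z),     # X NAND Z
    (0b00, 0b1111): lambda x, z: ~x | z,       # (NOT X) OR Z
    (0b10, 0b0100): lambda x, z: ~(x ^ z),     # X XNOR Z
    (0b10, 0b0101): lambda x, z: x ^ z,        # X XOR Z
    (0b10, 0b0110): lambda x, z: x ^ z,        # X XOR Z
    (0b10, 0b0111): lambda x, z: ~(x ^ z),     # X XNOR Z
    (0b10, 0b1100): lambda x, z: x | z,        # X OR Z
    (0b10, 0b1101): lambda x, z: x | ~z,       # X OR (NOT Z)
    (0b10, 0b1110): lambda x, z: ~(x | z),     # X NOR Z
    (0b10, 0b1111): lambda x, z: ~x & z,       # (NOT X) AND Z
}

def logic_unit(opmode_32, alumode, x, z):
    """Apply the selected 48-bit logic function to X and Z."""
    return LOGIC_OPS[(opmode_32, alumode)](x, z) & MASK48

and_result = logic_unit(0b00, 0b1100, 0b1100, 0b1010)   # 0b1000
or_result = logic_unit(0b10, 0b1100, 0b1100, 0b1010)    # 0b1110
```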

Pattern Detect and SIMD


Dual B Register

● B input to multiplier is controlled by INMODE[4]
- Dynamically selects B1/B2 pipeline level
● B input to X MUX and BCOUT cascade outputs
are statically controlled by bitstream options


Dual A, D Registers and Pre-Adder

● A input to multiplier is controlled by INMODE[3:0]
- Dynamically selects A1/A2 pipeline level
- Dynamically selects add/subtract
- Dynamically selects zero for A or D
● X MUX and ACOUT cascade inputs are statically
controlled.
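
The INMODE selections for the A, D, and B paths can be summarised in a small behavioural model. The bit assignments in the comments are an assumption based on the DSP48E1 user guide (UG479) and should be checked there; the function name is hypothetical and bit-width wrapping is ignored.

```python
def multiplier_inputs(inmode, a1, a2, b1, b2, d):
    """Return the (A-side, B-side) multiplier operands for a 5-bit INMODE."""
    a = a1 if inmode & 0b00001 else a2       # INMODE[0]: A1 or A2 pipeline level
    if inmode & 0b00010:                     # INMODE[1]: force the A path to zero
        a = 0
    d_term = d if inmode & 0b00100 else 0    # INMODE[2]: use D, otherwise zero
    subtract = bool(inmode & 0b01000)        # INMODE[3]: pre-adder subtracts A
    pre_adder = d_term - a if subtract else d_term + a
    b = b1 if inmode & 0b10000 else b2       # INMODE[4]: B1 or B2 pipeline level
    return pre_adder, b

plain = multiplier_inputs(0b00000, a1=1, a2=2, b1=3, b2=4, d=10)      # (A2, B2)
pre_added = multiplier_inputs(0b10101, a1=1, a2=2, b1=3, b2=4, d=10)  # (D + A1, B1)
```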


Pre-Adder

• The pre-adder can add or subtract the two 25-bit operands on the A and
the D inputs before the result drives the multiplier
• Benefits
– Well suited to operations using symmetrical coefficients
– Doubles the efficiency of symmetric FIR, symmetric IIR, and transpose
convolution filters
– Half the power consumption compared to architectures without a pre-adder
– A small change with a big benefit


Symmetrical Filters

When the coefficients are symmetrical

- The pre-adder reduces the number of multiplications by 50%
- Factoring the taps replaces one multiplication with a pre-addition
(or pre-subtraction)
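
The saving follows from factoring: when h[k] = h[N-1-k], the two products h[k]*x[k] + h[k]*x[N-1-k] collapse into h[k]*(x[k] + x[N-1-k]). The sketch below compares both forms for one output sample; the function names and sample values are illustrative.

```python
def fir_direct(window, coeffs):
    """One output sample using one multiplication per tap."""
    return sum(c * x for c, x in zip(coeffs, window))

def fir_symmetric(window, coeffs):
    """Same output for an even number of symmetric taps, with half the multiplications."""
    n = len(coeffs)
    assert n % 2 == 0 and coeffs == coeffs[::-1], "needs an even-length symmetric filter"
    total = 0
    for k in range(n // 2):
        pre_add = window[k] + window[n - 1 - k]   # role of the pre-adder
        total += coeffs[k] * pre_add              # one multiplication per tap pair
    return total

coeffs = [1, 2, 3, 3, 2, 1]     # six symmetric taps, so three multiplications
window = [5, -1, 4, 0, 2, 7]    # six most recent samples (arbitrary values)
direct = fir_direct(window, coeffs)
folded = fir_symmetric(window, coeffs)
```

Both calls give the same result, but the folded form needs three multiplications instead of six.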


Six-Tap Transpose FIR Filter Without Pre-Adder


Six-Tap Transpose FIR Filter Using the Pre-Adder

Optimized implementation supported by XST using only three DSP slices


Dynamic Pipeline Control

• The 7 series FPGA DSP slice has dynamic pipeline control on the A and B
registers
– The user can select which of the two pipeline registers to use for calculations on a
clock-by-clock basis
• Benefits
– Allows an operation to reuse the same operand in subsequent cycles


Application: Sequential Complex Multiply
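
As a rough sketch of the idea, a complex product (a + jb)(c + jd) = (ac - bd) + j(ad + bc) can be computed with a single multiplier over four cycles, with each input operand reused in two of them. The schedule and names below are assumptions for illustration, not the exact DSP48E1 mapping.

```python
def sequential_complex_multiply(a, b, c, d):
    """(a + jb) * (c + jd) using one multiplier over four cycles."""
    # One tuple per cycle: (first operand, second operand, sign, accumulator)
    schedule = [
        (a, c, +1, "re"),   # cycle 1: re  = a*c
        (b, d, -1, "re"),   # cycle 2: re -= b*d
        (a, d, +1, "im"),   # cycle 3: im  = a*d
        (b, c, +1, "im"),   # cycle 4: im += b*c
    ]
    acc = {"re": 0, "im": 0}
    for first, second, sign, target in schedule:
        acc[target] += sign * (first * second)   # single shared multiplier
    return acc["re"], acc["im"]

real, imag = sequential_complex_multiply(1, 2, 3, 4)   # (1 + 2j) * (3 + 4j)
```

Holding an operand in one pipeline register while the other register changes is what lets the same value feed the multiplier in more than one cycle.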
