DSP-Digital Signal Processors
Subject Code: EC-306 ECE Department, DTU
OBJECTIVE
Digital Signal Processing
Common DSP Algorithms
Features of DSP
DSP Data Path
DSP Memory
Situated Computing
An embedded system is situated in an external environment
Sensors provide input about external environment
Input signal processed by the embedded system
Sensors can be designed for virtually every physical and chemical quantity
weight, velocity, acceleration, electrical current, voltage, temperatures etc.
CCD Sensors: Image Sensors
Light-sensitive silicon solid-state device composed of many cells
When exposed to light, each cell becomes electrically charged
Current at each site integrated over a period of time (e.g., 15 ms) to get reasonable SNR
Exposure control
Charges shifted out using CCD shift register
Serial access of pixels
Analogue current converted to digital form using ADC operating at pixel rate
Biometric Sensors
Matrix of 256 x 256 sensing elements
Example: Fingerprint sensor (© Siemens, VDE)
Digital Signal Processing
Processing of digitally represented signals
Signals represented digitally as sequences of samples
Digital signals obtained from physical signals via transducers (e.g., microphones)
and analog-to-digital converters (ADC)
Digital Signal Processor (DSP): Electronic system that processes digital signals
Features of DSP:
Most DSP tasks require:
Repetitive numeric computations
Attention to numeric fidelity
High memory bandwidth, mostly via array accesses
Real-time processing
DSPs must perform these tasks efficiently while minimizing:
Cost
Power
Memory use
Development time
DSP Applications
Audio
Coding, decoding, surround-sound
Communication
Scrambling, cellular phones, software radios
Control
Robotics, disk drive control, motor control
Medical
Diagnostic equipment, hearing aids
Defense
Radar & sonar processing, missile guidance
Common DSP Algorithms
Algorithms
Filters: remove unwanted noise or separate components in different frequency bands
FIR (Finite Impulse Response) and IIR (Infinite Impulse Response)
Transformation: time domain to frequency domain for speech signals;
spatial domain to frequency domain for image signals
Frequency-time transformations: FFT
Correlation: used for signal classification
These are the common algorithms; a DSP must have dedicated
hardware to facilitate their implementation.
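As an illustration of correlation used for classification, the following is a minimal sketch that slides a known template over a signal and picks the lag with the largest sum of products. The function names `cross_correlation` and `best_lag` are hypothetical and not taken from the slides.

```python
def cross_correlation(x, y, lag):
    """Sum of x[n] * y[n + lag] over the indices where both samples exist."""
    return sum(
        x[n] * y[n + lag]
        for n in range(len(x))
        if 0 <= n + lag < len(y)
    )


def best_lag(x, y, max_lag):
    """Return the lag in [-max_lag, max_lag] with the largest correlation."""
    return max(
        range(-max_lag, max_lag + 1),
        key=lambda lag: cross_correlation(x, y, lag),
    )


template = [1, 2, 3]
signal = [0, 1, 2, 3]          # the template delayed by one sample
lag = best_lag(template, signal, max_lag=2)
```

Note that the inner loop is again a chain of multiply-accumulate operations, the same pattern as in filtering.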
FIR Filtering:
Each tap (M+1 taps total) requires
Two data fetches
Multiply
Accumulate
Memory write-back to update
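The per-tap work listed above can be sketched as a direct-form FIR filter. This is a minimal illustration, assuming `delay_line[k]` holds x(n-k) and `coeffs[k]` holds h(k); the function names are hypothetical.

```python
def fir_step(delay_line, coeffs, new_sample):
    """Process one input sample through a direct-form FIR filter.

    delay_line[k] holds x(n-k) and coeffs[k] holds h(k); both have M+1 entries.
    """
    # Memory write-back: shift every stored sample one position toward the end,
    # so the oldest sample drops out and the newest lands at index 0.
    for k in range(len(delay_line) - 1, 0, -1):
        delay_line[k] = delay_line[k - 1]
    delay_line[0] = new_sample

    acc = 0
    for k in range(len(coeffs)):
        # Per tap: two data fetches (sample and coefficient),
        # one multiply, one accumulate.
        acc += coeffs[k] * delay_line[k]
    return acc


def fir_filter(samples, coeffs):
    """Filter a whole sequence, starting from an all-zero delay line."""
    delay_line = [0] * len(coeffs)
    return [fir_step(delay_line, coeffs, x) for x in samples]
```

Feeding a unit impulse, for example `fir_filter([1, 0, 0, 0], [1, 2, 3])`, returns the coefficients followed by zeros, which is a quick sanity check for any FIR implementation.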
Simple DSP (1982): Texas Instruments TMS32010
16-bit fixed-point
“Harvard architecture”: separate instruction and data memories
Accumulator
Specialized instruction set: Load and Accumulate
390 ns Multiply-Accumulate (MAC) time; 228 ns today
Figure: Processor connected to separate Instruction Memory and Data Memory
Datapath (figure): Mem, T-Register, Multiplier, P-Register, ALU, Accumulator
This organization of the data path is primarily meant to facilitate the multiply-and-accumulate
operation, which is the key operation in filter implementation.
Here X4, H4, ... are direct (absolute) memory addresses:
LT X4 ; Load T with x(n-4)
MPY H4 ; P = H4*X4
LTD X3 ; Load T with x(n-3); x(n-4) = x(n-3);
; Acc = Acc + P
MPY H3 ; P = H3*X3
LTD X2
MPY H2
Two instructions per tap
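To make the two-instructions-per-tap pattern concrete, here is a toy Python model of the T register, P register, and accumulator. It is a sketch only: the class name, the base addresses, and the memory layout (x(n-k) stored at address X+k) are hypothetical. The closing `apac` step (add P to the accumulator) is not shown in the listing above; it is assumed here because the last product would otherwise never be added.

```python
class Tms32010Sketch:
    """Toy model of the T register, P register, and accumulator only."""

    def __init__(self, memory):
        self.mem = memory   # dict: address -> value (hypothetical layout)
        self.t = 0
        self.p = 0
        self.acc = 0

    def lt(self, addr):
        # LT: load T from data memory.
        self.t = self.mem[addr]

    def mpy(self, addr):
        # MPY: P = T * mem[addr].
        self.p = self.t * self.mem[addr]

    def ltd(self, addr):
        # LTD: add the previous product, load T, and copy the sample
        # to the next higher address (the delay-line shift).
        self.acc += self.p
        self.t = self.mem[addr]
        self.mem[addr + 1] = self.mem[addr]

    def apac(self):
        # APAC: add the final product to the accumulator (assumed closing step).
        self.acc += self.p


def fir_output(x, h):
    """x[k] is x(n-k), h[k] is h(k). Returns (y(n), shifted delay line)."""
    taps = len(h)
    X, H = 0, 100           # hypothetical base addresses; needs taps < 100
    mem = {X + k: x[k] for k in range(taps)}
    mem.update({H + k: h[k] for k in range(taps)})
    cpu = Tms32010Sketch(mem)
    cpu.lt(X + taps - 1)
    cpu.mpy(H + taps - 1)
    for k in range(taps - 2, -1, -1):   # two instructions per remaining tap
        cpu.ltd(X + k)
        cpu.mpy(H + k)
    cpu.apac()
    return cpu.acc, [mem[X + k] for k in range(taps)]
```

After one pass the delay line has moved by one position, so the next input sample only needs to be written to address X before the sequence is repeated.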
Features Common to Most DSP Processors
Data path configured for DSP
Specialized instruction set
Multiple memory banks and buses (for processing multiple inputs simultaneously)
Specialized addressing modes
Specialized execution control techniques
Specialized peripherals for getting inputs from sensors for DSP
DAC for outputs and ADC for the inputs
Features of DSP: Data Paths
Specialized Hardware to perform key arithmetic operations (multiply and accumulate) in one cycle
Hardware support for managing numeric fidelity:
Shifters: adjustment of mantissas of floating-point numbers
Guard bits: additional bits to increase accuracy of results
Saturation: preventing wrap around on overflow or underflow
DSP Data Path: Precision
Word size affects the precision of fixed-point (Qm.n format) numbers
Precision is the smallest step between two consecutive representable numbers for a given number of bits
DSPs have 16-bit, 20-bit, or 24-bit data words
Floating-point DSPs cost 2X-4X as much as fixed-point DSPs, are slower, require more hardware, and consume more power
Floating-point support simplifies development
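The link between fractional bits and precision can be checked with a short sketch. It assumes a 16-bit word with 15 fractional bits (often written Q15) as the default; the helper names `precision`, `to_fixed`, and `to_float` are hypothetical.

```python
def precision(frac_bits):
    """Smallest step between two consecutive fixed-point values."""
    return 2.0 ** -frac_bits


def to_fixed(value, word_bits=16, frac_bits=15):
    """Quantize a real number to a signed fixed-point integer, with saturation."""
    raw = int(round(value * (1 << frac_bits)))
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, raw))


def to_float(raw, frac_bits=15):
    """Convert the stored integer back to the real value it represents."""
    return raw / (1 << frac_bits)


step = precision(15)                     # 2**-15
error = abs(to_float(to_fixed(0.3)) - 0.3)
```

With rounding to the nearest step, the quantization error of an in-range value stays within half a step. Note that +1.0 is not representable in this format and saturates to the largest code.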
DSP Data Path: Saturation
Saturation:
Set to most positive (2^(N-1) - 1) or
most negative value (-2^(N-1))
No wrap around
Special arithmetic instructions
Useful for signal processing operations
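The difference between wrap-around and saturation can be sketched for an N-bit two's-complement result. The helper names `wrap` and `saturate` are hypothetical; real DSPs do this in hardware as part of the arithmetic instruction.

```python
def wrap(value, bits):
    """Keep only the low `bits` bits and reinterpret them as two's complement."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def saturate(value, bits):
    """Clamp to the most positive or most negative representable value."""
    hi = (1 << (bits - 1)) - 1      # 2^(N-1) - 1
    lo = -(1 << (bits - 1))         # -2^(N-1)
    return max(lo, min(hi, value))


total = 30000 + 10000               # exceeds the 16-bit range
wrapped = wrap(total, 16)           # large positive sum turns negative
clipped = saturate(total, 16)       # stays at the positive limit
```

For audio or control signals, a clipped peak is usually far less harmful than a sign flip, which is why saturation is the preferred overflow behavior.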
DSP Data Path: Multiplier
Specialized hardware performs all key arithmetic operations in 1 cycle
more than 50% of instructions can involve multiplier
=> single cycle latency multiplier
Need to perform multiply-accumulate (MAC)
n-bit multiplier => 2n-bit product
The multiplier is built as a separate hardware block, not strictly part of the ALU.
DSP Data Path: Rounding
Even with guard bits, results must be rounded
Three standard DSP options:
Chopping: remove guard bits with no change to the retained bits
=> biased approximation; in two's complement, biases results down (toward negative infinity)
Von Neumann rounding: if the bits to be removed are all 0, the retained bits are unchanged;
if any of the removed bits is 1, the LSB of the retained bits is set to 1
=> unbiased approximation
Rounding: round to the nearest number, or to the even number in case of a tie.
A 1 is added at the LSB position of the retained bits if the MSB of the removed bits is 1
and at least one subsequent removed bit is also 1; if only the MSB of the removed bits is 1
(a tie), round so that the LSB of the retained bits becomes zero.
Rounding introduces additional computational and hardware requirements for the
processor.
Figure: Multiplier, Shift, ALU, Accumulator
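The three options can be compared with a minimal sketch that drops the low `k` bits of an integer. It assumes two's-complement values and `k >= 1`; the function names are hypothetical. Python's `>>` on integers is an arithmetic shift, so `chop` here always rounds toward negative infinity.

```python
def chop(value, k):
    """Chopping: discard the low k bits, retained bits unchanged."""
    return value >> k


def von_neumann(value, k):
    """Von Neumann rounding: set the retained LSB if any removed bit is 1."""
    kept = value >> k
    removed = value & ((1 << k) - 1)
    return kept | 1 if removed else kept


def round_convergent(value, k):
    """Round to nearest; on an exact tie, make the retained LSB zero."""
    kept = value >> k
    removed = value & ((1 << k) - 1)
    half = 1 << (k - 1)
    if removed > half or (removed == half and kept & 1):
        kept += 1
    return kept


def total_error(rounder, k, values):
    """Sum of (rounded value scaled back) minus original over a set of inputs."""
    return sum((rounder(v, k) << k) - v for v in values)


inputs = range(-8, 8)
chop_error = total_error(chop, 2, inputs)                    # negative
convergent_error = total_error(round_convergent, 2, inputs)  # cancels out
```

Summing the error over a symmetric set of inputs shows the bias directly: chopping accumulates a negative error, while round-to-nearest-even cancels out on this input set.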
DSP Data Path: Accumulator
Option 1: accumulator wider than product: guard bits
Example: Motorola DSP
24b x 24b => 48b product, 56b accumulator
Option 2: shift right and round product before adder
Figure: Multiplier, ALU, Accumulator with guard bits (G)
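A short calculation shows what the guard bits in Option 1 buy. This is a sketch with hypothetical helper names: with 48-bit products and a 56-bit accumulator there are 8 guard bits, so 2^8 = 256 worst-case products can be summed before overflow becomes possible.

```python
def guard_bits(acc_bits, product_bits):
    """Extra accumulator bits beyond the product width."""
    return acc_bits - product_bits


def max_safe_accumulations(acc_bits, product_bits):
    """Number of worst-case products that can be summed without overflow."""
    return 1 << guard_bits(acc_bits, product_bits)


def fits(value, bits):
    """True if value is representable as a signed two's-complement number."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


worst_product = -(1 << 47)      # most negative 48-bit value
safe = max_safe_accumulations(56, 48)
still_fits = fits(safe * worst_product, 56)
overflows = not fits((safe + 1) * worst_product, 56)
```

For filters with more taps than this bound, Option 2 (shifting and rounding each product before the adder) or explicit scaling of the coefficients is needed to stay within range.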
DSP Memory
FIR Tap implies multiple memory accesses
Data, coefficients
DSPs want multiple data ports
Some DSPs have ad hoc techniques to reduce memory bandwidth demand
Instruction repeat buffer: do 1 instruction 256 times
Often disables interrupts, thereby increasing interrupt latency
Some recent DSPs have instruction caches
May allow the programmer to “lock” instructions into the cache (locked instructions are never evicted)
Option to turn cache into fast program memory
DSPs typically have no data caches
Data arrives as a stream or is stored in a buffer, so the same data is not expected to be
used multiple times
May have multiple data memories
THANK YOU