DSP Arithmetic
Contents
Fixed point representation
Floating point representation
Some math operations: addition, subtraction, multiplication, division
Comparison
Introduction
A practical DSP implementation should take into account:
Possible quantization errors
Arithmetic errors
Possible overflow
A DSP processor's data format determines its ability to handle signals of different precisions, dynamic ranges, and signal-to-quantization-noise ratios (SQNRs). To write efficient programs for DSP applications, we must understand how the processor manipulates data.
Fixed point representation uses two's complement to represent signed numbers. The first bit is used as the sign bit.
Word layout (left to right): 1 sign bit, m integer bits, radix point, n fractional bits.
Q15 format
For example, a 16-bit number that uses 1 sign bit and 15 bits for the fractional part is in Q0.15 format, or simply Q15 format. Q15 format is commonly used in DSP systems, and data must be properly scaled so that its value lies between -1 and 0.999969482421875 (that is, $1 - 2^{-15}$).
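The Q15 range can be checked with a short sketch. The helper names `float_to_q15` and `q15_to_float` are hypothetical, and rounding plus saturation at the range limits is an assumption made here, not something the slides prescribe.

```python
# Minimal Q15 sketch: 1 sign bit + 15 fractional bits in a 16-bit word.
Q15_FRACTIONAL_BITS = 15
Q15_MIN = -(1 << 15)      # -32768 represents -1.0
Q15_MAX = (1 << 15) - 1   #  32767 represents 1 - 2**-15


def float_to_q15(x):
    """Scale a real value by 2**15, round, and saturate to the Q15 range (assumed policy)."""
    scaled = int(round(x * (1 << Q15_FRACTIONAL_BITS)))
    return max(Q15_MIN, min(Q15_MAX, scaled))


def q15_to_float(q):
    """Interpret a Q15 integer code as a fraction."""
    return q / (1 << Q15_FRACTIONAL_BITS)


print(q15_to_float(Q15_MIN))  # -1.0
print(q15_to_float(Q15_MAX))  # 0.999969482421875
```

A value of exactly 1.0 has no Q15 code, which is why the sketch saturates it to 32767.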
Decimal   Fraction   Fractional notation (binary)
0         0          000
1         1/8        001
2         2/8        010
3         3/8        011
4         4/8        100
5         5/8        101
6         6/8        110
7         7/8        111
Decimal   Fraction   Fractional notation (binary)
 3         3/4       011
 2         2/4       010
 1         1/4       001
 0         0         000
-1        -1/4       111
-2        -2/4       110
-3        -3/4       101
-4        -1         100
Example
Represent the decimal number 0.95624 as (a) a Q3 number and (b) a Q4 number.
Q3 number: A Q3 number is a two's complement number with one sign bit and 3 fractional bits.
$0.95624 \times 2^3 = 7.64992$
Rounding to the nearest integer would give 8, which exceeds the largest positive Q3 code, so the value is truncated to 7 = 0111.
Q4 number: A Q4 number is a two's complement number with one sign bit and 4 fractional bits.
$0.95624 \times 2^4 = 15.29984$
This number can be rounded to 15 = 01111.
Errors in representation:
Case 1 (Q3 notation): error = $(7.64992 - 7)/8 = 0.08124$
Case 2 (Q4 notation): error = $(15.29984 - 15)/16 = 0.01874$
The error in representing the number is often referred to as coefficient quantization error.
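The two cases can be reproduced with a small sketch. The function name `quantize` is hypothetical, and truncation with a clamp to the largest positive code is an assumption chosen because it matches the Q3 result (7) in the example; it only handles non-negative inputs below 1.

```python
import math


def quantize(x, n):
    """Quantize 0 <= x < 1 to n fractional bits by truncation (assumed policy).

    Returns the integer code and the representation error x - code / 2**n.
    """
    max_code = (1 << n) - 1                      # largest positive code
    code = min(math.floor(x * (1 << n)), max_code)
    error = x - code / (1 << n)
    return code, error


for n in (3, 4):
    code, error = quantize(0.95624, n)
    print(n, code, round(error, 5))
# prints:
# 3 7 0.08124
# 4 15 0.01874
```

Each extra fractional bit halves the quantization step, which is why the Q4 error is much smaller than the Q3 error.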
Implementation
Most fixed-point DSP processors use two's complement fractional numbers in different Q formats. However, assemblers only recognize integer values.
The programmer must keep track of the position of the binary point when manipulating fractional numbers in assembly programs. The following steps convert a fractional number in Q format into an integer value that can be recognized by the assembler. Let us see this with an example:
Assume that the coefficient to be given to the assembler is 1.18 and that the DSP processor uses Q15 format.
Step 1: Normalize the fractional number to the range determined by the desired Q format.
For Q15 format the range is [-1, 1). Normalizing 1.18 to this range gives 1.18/2 = 0.59.
Step 2: Multiply the normalized fractional number by $2^n$, where n is the number of fractional bits.
For Q15, multiply 0.59 by $2^{15}$: $0.59 \times 32{,}768 = 19{,}333.12$.
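The two steps above can be sketched as follows. The function name `to_assembler_integer` and its parameters are hypothetical, and the final rounding to the nearest integer is an assumption, since the steps shown stop at the scaled value 19,333.12.

```python
def to_assembler_integer(value, scale, n):
    """Convert a coefficient to an integer constant for an n-fractional-bit Q format.

    scale is the normalization divisor chosen by the programmer (2 in the example).
    """
    normalized = value / scale                   # Step 1: bring into [-1, 1)
    if not -1.0 <= normalized < 1.0:
        raise ValueError("normalized value is outside [-1, 1)")
    scaled = normalized * (1 << n)               # Step 2: multiply by 2**n
    return int(round(scaled))                    # assumed: round to nearest integer


coeff = to_assembler_integer(1.18, 2, 15)
print(coeff)       # 19333
print(hex(coeff))  # 0x4b85
```

The programmer still has to remember the divisor of 2 used in Step 1 and compensate for it later in the algorithm.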
The arithmetic result obtained by a DSP processor is in integer form. It can be interpreted as a fractional value by dividing by $2^n$, which is equivalent to shifting the binary point n bits to the left. In a DSP implementation, it is not always necessary to use Q15 format throughout the algorithm; instead, we can use different Q formats for different dynamic range requirements.
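As a small illustration of this interpretation step, the sketch below reads the same 16-bit integer under two different Q formats. The helper name `interpret` and the choice of Q12 as the second format are assumptions made for illustration only.

```python
def interpret(raw, n):
    """Interpret an integer result as a fraction with n fractional bits (divide by 2**n)."""
    return raw / (1 << n)


raw = 19333                # integer value as held in a 16-bit register
print(interpret(raw, 15))  # 0.589996337890625  (Q15: range [-1, 1))
print(interpret(raw, 12))  # 4.719970703125     (Q3.12: range [-8, 8))
```

Fewer fractional bits give a wider dynamic range at the cost of a coarser step size.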
Binary addition-Example
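Addition of two 4-bit numbers represented in Q3 format:
0.100 (0.5) + 0.011 (0.375) = 0.111 (0.875)
No overflow occurs here, because the sum lies within the Q3 range of -1 to 0.875.
Overflow occurs when the true sum falls outside this range.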
Thus, addition of two numbers in fractional representation can result in overflow.
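Both cases can be sketched with 4-bit two's complement wraparound. The function name `q3_add` is hypothetical, and the overflowing operand pair (0.5 + 0.625) is chosen here for illustration; it is not taken from the slides.

```python
def q3_add(a, b):
    """Add two 4-bit Q3 codes (integers in -8..7).

    Returns the sum wrapped to 4 bits, as two's complement hardware without
    saturation would produce it, plus a flag indicating overflow.
    """
    total = a + b
    overflowed = not -8 <= total <= 7
    wrapped = ((total + 8) % 16) - 8   # keep only the low 4 bits, signed
    return wrapped, overflowed


print(q3_add(4, 3))  # (7, False): 0.5 + 0.375 = 0.875
print(q3_add(4, 5))  # (-7, True): 0.5 + 0.625 wraps to -0.875
```

The wrapped result has the wrong sign, which is why scaling or saturation arithmetic is used to guard against overflow.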
Binary multiplication-Example
Multiplying two 4-bit numbers in Q3 format requires a 7-bit word in Q6 format to store the product, and there is no overflow:
0.111 (0.875) × 0.110 (0.75) = 0.101010 (0.65625)
To store the result in a 4-bit word, we truncate it to the four most significant bits (0.101 = 0.625). The error is then 0.65625 - 0.625 = 0.03125.
Multiplication in Q format does not result in overflow except in the case of (-1) × (-1) = 1, which is not in the range.
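The same product can be traced with integer codes. The function name `q3_multiply` is hypothetical; the sketch assumes truncation by an arithmetic right shift, which rounds toward negative infinity for negative products.

```python
def q3_multiply(a, b):
    """Multiply two Q3 codes (integers in -8..7).

    Returns the full-precision Q6 product and the product truncated back to Q3.
    """
    product_q6 = a * b               # Q3 x Q3 gives 6 fractional bits
    truncated_q3 = product_q6 >> 3   # drop the 3 least significant bits
    return product_q6, truncated_q3


p6, p3 = q3_multiply(7, 6)           # 0.875 x 0.75
print(p6 / 64)                       # 0.65625
print(p3 / 8)                        # 0.625
print(p6 / 64 - p3 / 8)              # 0.03125

corner_q6, _ = q3_multiply(-8, -8)   # (-1) x (-1)
print(corner_q6 / 64)                # 1.0, which has no Q3 code
```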
Binary division
Hardware implementation of division is expensive. Therefore, most processors do not provide a single-cycle divide instruction supported by the hardware.
For an N-bit fractional number, fractional division can be realized by repeating the conditional subtraction instruction (N-1) times.
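A minimal sketch of division by repeated conditional subtraction is given below. It is a generic shift-and-subtract (restoring) loop, not the instruction sequence of any particular processor, and it assumes non-negative operands with the numerator smaller than the denominator so that the quotient is a fraction below 1. The function name `fractional_divide` is hypothetical.

```python
def fractional_divide(numerator, denominator, n_bits=16):
    """Divide two non-negative fractional codes with n_bits - 1 fractional bits.

    Performs n_bits - 1 shift-and-conditional-subtract steps, producing one
    quotient bit per step. Requires 0 <= numerator < denominator.
    """
    if not 0 <= numerator < denominator:
        raise ValueError("requires 0 <= numerator < denominator")
    remainder = numerator
    quotient = 0
    for _ in range(n_bits - 1):
        remainder <<= 1                  # shift the partial remainder left
        quotient <<= 1                   # make room for the next quotient bit
        if remainder >= denominator:     # conditional subtraction
            remainder -= denominator
            quotient |= 1
    return quotient


q = fractional_divide(8192, 16384)       # 0.25 / 0.5 in Q15
print(q, q / 2**15)                      # 16384 0.5
```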
The decimal equivalent, X, of a normalized IEEE floating point number is given by
$X = (-1)^s \times (1.F) \times 2^{E-127}$
where
F is the mantissa, stored as an unsigned binary fraction (the sign is carried by s rather than by two's complement)
E is the exponent in excess-127 form
s = 0 for positive numbers, s = 1 for negative numbers
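The three fields can be inspected with Python's standard `struct` module. This is a sketch for normalized single-precision values only (zeros, denormals, infinities, and NaNs are not handled); the helper name `decode_float32` and the test value 2.44 are chosen here for illustration.

```python
import struct


def decode_float32(x):
    """Split a value, stored as IEEE single precision, into (s, E, F).

    s is the sign bit, E the excess-127 exponent, F the fraction in [0, 1).
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31
    e = (bits >> 23) & 0xFF
    f = (bits & 0x7FFFFF) / (1 << 23)
    return s, e, f


s, e, f = decode_float32(2.44)
value = (-1) ** s * (1 + f) * 2.0 ** (e - 127)
print(s, e)    # 0 128
print(value)   # close to 2.44, limited by the 23-bit fraction
```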
Example
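We are given two floating point numbers:
$A = 2.44 = (-1)^0 \times 1.22 \times 2^{128-127}$
$B = -12.16 = (-1)^1 \times 1.52 \times 2^{130-127}$
Here, $|B| > |A|$, so A is aligned to the exponent of B before the addition:
$A + B = [(-1)^1 \times 1.52 + (-1)^0 \times 1.22 \times 2^{-(130-128)}] \times 2^{130-127} = (-1)^1 \times 1.215 \times 2^3 = -9.72$
So, in the result: s = 1, mantissa F = 0.215 (that is, 1.F = 1.215), exp = 3 + 127 = 130.
For the product:
$A \times B = (-1)^1 \times 1.8544 \times 2^4 = -29.6704$
exp = 4 + 127 = 131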
The mantissas of the two numbers are multiplied, while the exponent terms are added without the need to align them.
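The alignment and normalization steps can be sketched on (s, mantissa, E) triples, where the mantissa is the value 1.F in [1, 2) and E is the excess-127 exponent. The function names are hypothetical, the mantissas are ordinary Python floats rather than 23-bit fields, and B is taken as negative (s = 1), consistent with the sign of the result stated in the example.

```python
def fp_value(x):
    """Real value of an (s, mantissa, E) triple."""
    s, m, e = x
    return (-1) ** s * m * 2.0 ** (e - 127)


def fp_add(a, b):
    """Add two triples: align the smaller operand, add mantissas, renormalize."""
    big, small = (a, b) if (a[2], a[1]) >= (b[2], b[1]) else (b, a)
    s_big, m_big, e_big = big
    s_small, m_small, e_small = small
    aligned = m_small / 2.0 ** (e_big - e_small)   # shift to the larger exponent
    m = m_big + aligned if s_big == s_small else m_big - aligned
    e = e_big
    if m == 0.0:
        return (0, 0.0, 0)                         # zero has no normalized form
    while m >= 2.0:                                # renormalize into [1, 2)
        m, e = m / 2.0, e + 1
    while m < 1.0:
        m, e = m * 2.0, e - 1
    return (s_big, m, e)


def fp_multiply(a, b):
    """Multiply two triples: multiply mantissas, add exponents (one bias removed)."""
    s, m, e = a[0] ^ b[0], a[1] * b[1], a[2] + b[2] - 127
    if m >= 2.0:                                   # product of [1, 2) values is below 4
        m, e = m / 2.0, e + 1
    return (s, m, e)


A = (0, 1.22, 128)   #   2.44
B = (1, 1.52, 130)   # -12.16
print(fp_add(A, B))        # sign 1, mantissa about 1.215, exponent 130
print(fp_multiply(A, B))   # sign 1, mantissa about 1.8544, exponent 131
```

Only the addition needs the alignment shift; the multiplication works directly on the mantissas and exponents.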
Most floating point processors perform automatic normalization so that numbers are properly shifted and aligned. The programmer only needs to take care of overflow. However, because of the large dynamic range, scaling is rarely necessary. Hence, floating point processors are easier to use than fixed point processors.
COMPARISON
Between fixed point and floating point notations
Fixed point:
16- or 24-bit devices
Faster clock rate
Floating point:
32-bit devices
Slower clock rate
Functional units are complex, more silicon area required
More expensive
Higher power consumption