DSP Arithmetic

Contents
Fixed point representation
Floating point representation
Some math operations: addition, subtraction, multiplication, division
Comparison
Introduction
A practical DSP implementation should take into account possible quantization errors, arithmetic errors, and possible overflow.
A DSP processor's data format determines its ability to handle signals of different precisions, dynamic ranges, and signal-to-quantization-noise ratios (SQNRs). To write efficient programs for DSP applications, we must understand how the processor manipulates data.
FIXED POINT NOTATION

Some Fixed-point processors:

TMS320C64xx processors
ADSP2101 processor
Fixed point notation

Fixed point DSPs usually represent each number with a minimum of 16 bits, although a different length can be used. There are four common ways that these $2^{16} = 65{,}536$ possible bit patterns can represent a number: unsigned integer, signed integer, unsigned fraction, and signed fraction.
Fixed Point Representation

In unsigned integer notation, the stored number can take on any integer value from 0 to 65,535.
Example: Consider a 4-bit representation. We can represent numbers in the range 0 to 15:
$4_{10} = 0100_2$, $7_{10} = 0111_2$, $8_{10} = 1000_2$
However, if the result of an arithmetic operation exceeds $15_{10}$, overflow occurs.
Fixed Point Representation

Similarly, signed integer notation uses two's complement to make the range include negative numbers, from -32,768 to 32,767. The first (most significant) bit serves as the sign bit.
Example: Consider a 4-bit representation. We can represent numbers in the range -8 to 7:
$5_{10} = 0101_2$, $-5_{10} = 1011_2$
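The 4-bit example can be checked with a short Python sketch. The function names `to_twos_complement` and `from_twos_complement` are illustrative helpers, not part of any DSP toolchain.

```python
def to_twos_complement(value: int, bits: int = 4) -> str:
    """Return the two's complement bit pattern of `value` using `bits` bits."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise OverflowError(f"{value} does not fit in {bits} signed bits")
    # Masking with 2**bits - 1 maps a negative value onto its bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Interpret a bit string as a two's complement signed integer."""
    raw = int(pattern, 2)
    # A set sign bit means the pattern encodes raw - 2**bits.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


print(to_twos_complement(5))         # 0101
print(to_twos_complement(-5))        # 1011
print(from_twos_complement("1000"))  # -8
```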
Fractional fixed point notation

Used for representing numbers with both integer and fractional parts. The Qm.n convention uses m bits to represent the integer portion of the number and n bits to represent the fractional portion. Total number of bits: $N = m + n + 1$.
Word layout, from most to least significant bit: 1 sign bit, m integer bits, radix point, n fractional bits.
Q15 format
For example, a 16-bit number that uses 1 sign bit and 15 bits for the fractional part is called Q0.15 format, or simply Q15 format. Q15 format is commonly used in DSP systems, and data must be properly scaled so that their values lie between -1 and $1 - 2^{-15} = 0.999969482421875$.
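The range quoted for Q15 follows directly from the Qm.n definition. A minimal sketch, assuming a signed word with one sign bit (the helper name `q_format_range` is made up for illustration):

```python
def q_format_range(m: int, n: int):
    """Return (minimum, maximum, resolution) of a signed Qm.n number.

    Assumes 1 sign bit, m integer bits and n fractional bits.
    """
    resolution = 2.0 ** -n          # weight of the least significant bit
    minimum = -(2.0 ** m)           # sign bit set, all other bits clear
    maximum = 2.0 ** m - resolution
    return minimum, maximum, resolution


lowest, highest, step = q_format_range(0, 15)
print(lowest, highest)  # -1.0 0.999969482421875
```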
Fractional Fixed point notation

With unsigned fraction notation, the 65,536 levels are spread uniformly between 0 and 1.
Example: Consider 3-bit unsigned fraction representation

Number   Decimal fraction   Fractional notation
0        0                  000
1        1/8                001
2        2/8                010
3        3/8                011
4        4/8                100
5        5/8                101
6        6/8                110
7        7/8                111
Fractional Fixed point notation

Lastly, the signed fraction format allows negative numbers, equally spaced between -1 and 1.
Example: Consider 3-bit signed fraction representation.

Number   Decimal fraction   Fractional notation
3        3/4                011
2        2/4                010
1        1/4                001
0        0                  000
-1       -1/4               111
-2       -2/4               110
-3       -3/4               101
-4       -1                 100
Example
Represent the decimal number 0.95624 as a Q3 number and as a Q4 number.
Q3 number: A Q3 number is a two's complement number with one sign bit and 3 fractional bits.
$0.95624 \times 2^3 = 7.64992$
The nearest integer, 8, exceeds the largest 4-bit Q3 code, so the value is truncated to $7 = 0111_2$.
Example (contd.)
Q4 number: A Q4 number is a two's complement number with one sign bit and 4 fractional bits.
$0.95624 \times 2^4 = 15.29984$
This number can be rounded to $15 = 01111_2$.
Example (contd.)
Errors in representation:
Case 1: Q3 notation, error $= (7.64992 - 7)/8 = 0.08124$
Case 2: Q4 notation, error $= (15.29984 - 15)/16 = 0.01874$
The error in representing the number is often referred to as coefficient quantization error.
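The two cases can be reproduced numerically. This sketch assumes round-to-nearest followed by saturation to the largest representable code, which yields the codes 7 and 15 used in the example; `quantize` is an illustrative name.

```python
def quantize(x: float, n: int):
    """Quantize x to a signed Qn word (1 sign bit, n fractional bits).

    Returns (integer code, representation error).
    """
    scaled = x * 2 ** n
    code = round(scaled)
    # Saturate: a Qn word holds codes from -2**n to 2**n - 1 only.
    code = max(-(2 ** n), min(2 ** n - 1, code))
    error = (scaled - code) / 2 ** n
    return code, error


for n in (3, 4):
    code, error = quantize(0.95624, n)
    print(f"Q{n}: code={code} bits={format(code, f'0{n + 1}b')} error={error:.5f}")
# Q3: code=7 bits=0111 error=0.08124
# Q4: code=15 bits=01111 error=0.01874
```

More fractional bits shrink the coefficient quantization error, here by roughly a factor of four between Q3 and Q4.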
Implementation
Most fixed-point DSP processors use two's complement fractional numbers in different Q formats. However, assemblers only recognize integer values.
The programmer must therefore keep track of the position of the binary point when manipulating fractional numbers in assembly programs. The following steps convert a fractional number in Q format into an integer value that the assembler can recognize, illustrated with an example:
Implementation (contd.)
Assume that the coefficient to be used is 1.18 and that the DSP processor uses Q15 format.
Step 1: Normalize the fractional number to the range determined by the desired Q format.
For Q15 format the range is [-1, 1). Normalize the number to this range: $1.18 / 2 = 0.59$
Step 2: Multiply the normalized fractional number by $2^n$, where n is the number of fractional bits.
Multiply 0.59 by $2^{15}$: $0.59 \times 32{,}768 = 19{,}333.12$
Step 3: Round the product to the nearest integer.
Round the decimal value 19,333.12 to obtain 19333 = 4B85h
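The three steps can be sketched in Python as follows. The function name `to_q15_word` and its `scale` parameter (the divisor chosen in Step 1) are assumptions made for this illustration.

```python
def to_q15_word(coefficient: float, scale: float = 1.0) -> int:
    """Convert a coefficient into the Q15 integer handed to the assembler."""
    normalized = coefficient / scale                  # Step 1: bring into [-1, 1)
    if not -1.0 <= normalized < 1.0:
        raise ValueError("normalized value must lie in [-1, 1)")
    scaled = normalized * 2 ** 15                     # Step 2: multiply by 2**n
    # Step 3: round; clamp so values just below 1.0 cannot round up to 32768.
    return min(round(scaled), 2 ** 15 - 1)


word = to_q15_word(1.18, scale=2.0)
# Masking with 0xFFFF shows the 16-bit two's complement pattern in hex.
print(word, format(word & 0xFFFF, "04X") + "h")       # 19333 4B85h
```

Dividing `word` by `2 ** 15` recovers approximately 0.59, the normalized coefficient; the factor of 2 removed in Step 1 has to be accounted for elsewhere in the computation.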
Implementation (contd.)
The arithmetic result obtained by a DSP processor is in integer form. It can be interpreted as a fractional value by dividing by $2^n$, which is equivalent to shifting the binary point n bits to the left.
In a DSP implementation, it is not always necessary to use Q15 format throughout the algorithm; instead, we can use different Q formats for different dynamic range requirements.
Binary addition-Example
Addition of two 4-bit numbers represented in Q3 format:
0.100 (0.5) + 0.011 (0.375) = 0.111 (0.875): no overflow
0.101 (0.625) + 0.011 (0.375) = 1.000 (1): overflow, because +1 lies outside the Q3 range and the bit pattern 1.000 represents -1
Thus, addition of two numbers in fractional representation can result in overflow.
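A small sketch of 4-bit Q3 addition with wraparound makes the overflow case visible. Operands are passed as integer codes (value times 8), and `add_q3` is an illustrative helper that models a plain two's complement adder without saturation.

```python
def add_q3(a: int, b: int):
    """Add two 4-bit Q3 codes (integers in -8..7).

    Returns (wrapped 4-bit result, overflow flag).
    """
    total = a + b
    # Keep only 4 bits and reinterpret them as two's complement.
    wrapped = ((total + 8) % 16) - 8
    return wrapped, wrapped != total


for a, b in ((4, 3), (5, 3)):          # 0.100 + 0.011 and 0.101 + 0.011
    code, overflow = add_q3(a, b)
    print(a / 8, "+", b / 8, "=", code / 8, "overflow" if overflow else "no overflow")
# 0.5 + 0.375 = 0.875 no overflow
# 0.625 + 0.375 = -1.0 overflow
```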
Binary multiplication-Example
Multiplying two 4-bit numbers in Q3 format requires a 7-bit word in Q6 format to store the product, and there is then no overflow:
0.111 (0.875) $\times$ 0.110 (0.75) = 0.101010 (0.65625)
If we want to store the result in a 4-bit word, we truncate it to the four most significant bits, 0.101 (0.625). The error is then $0.65625 - 0.625 = 0.03125$.
Multiplication in Q format does not result in overflow except in the case of $(-1) \times (-1) = 1$, which is not in the range.
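The product and its truncation can be sketched with integer codes. `mul_q3` is a hypothetical helper; truncation is modeled as an arithmetic right shift by 3 bits.

```python
def mul_q3(a: int, b: int):
    """Multiply two 4-bit Q3 codes (integers in -8..7).

    Returns (full Q6 product code, product truncated back to Q3).
    """
    product_q6 = a * b                 # Q3 x Q3 gives 6 fractional bits
    # Dropping the 3 least significant bits; >> floors toward minus infinity.
    truncated_q3 = product_q6 >> 3
    return product_q6, truncated_q3


full, short = mul_q3(7, 6)             # 0.111 x 0.110
print(format(full, "06b"), full / 64)  # 101010 0.65625
print(format(short, "03b"), short / 8) # 101 0.625
print(full / 64 - short / 8)           # 0.03125

# The one overflow case: (-1) x (-1) = +1, which a 7-bit Q6 word cannot hold.
print(mul_q3(-8, -8)[0] / 64)          # 1.0
```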
Binary division
Hardware implementation of division is expensive. Therefore, most processors do not provide a single-cycle divide instruction supported by the hardware.
For an N-bit fractional number, fractional division can be realized by repeating the conditional subtraction instruction (N-1) times.
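As a rough illustration, the loop below performs shift-and-conditional-subtract division on integer codes. It is a sketch under stated assumptions: both operands are non-negative, the numerator is smaller than the denominator (so the quotient is a valid fraction), and the function name `fractional_divide` is made up. Actual processor instructions differ in detail.

```python
def fractional_divide(num: int, den: int, n_bits: int = 16) -> int:
    """Divide two non-negative fractional codes with num < den.

    Operands and result use 1 sign bit and n_bits - 1 fractional bits.
    The loop body runs n_bits - 1 times, one quotient bit per pass.
    """
    if not 0 <= num < den:
        raise ValueError("requires 0 <= num < den")
    remainder, quotient = num, 0
    for _ in range(n_bits - 1):
        remainder <<= 1
        quotient <<= 1
        if remainder >= den:           # conditional subtraction
            remainder -= den
            quotient |= 1
    return quotient


q = fractional_divide(8192, 16384)     # 0.25 / 0.5 in Q15
print(q, q / 2 ** 15)                  # 16384 0.5
```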
FLOATING POINT ARITHMETIC
Floating Point Processors

TMS320C3x, TMS320C67x, ADSP2106x
Floating point formats allow numbers to be represented with a large dynamic range. Thus, floating point arithmetic can reduce the problem of overflow that occurs in fixed point arithmetic.
Floating point formats

A binary floating point number X is represented using two signed numbers, the mantissa M and the exponent E:
$X = M \times 2^E$
The exponent determines the range of numbers that can be represented; the mantissa determines their accuracy. For example, with a 16-bit mantissa and an 8-bit exponent:
Range of numbers that can be represented: $0.5 \times 2^{-128}$ to $(1 - 2^{-15}) \times 2^{128}$
IEEE floating point format

IEEE 754 Standard:
Bit 31: sign s | Bits 30-23: exponent E (8 bits) | Bits 22-0: mantissa F (23 bits)
Fig: Floating point representation (IEEE single precision)
The decimal equivalent, X, of a normalized IEEE floating point number is given by
$X = (-1)^s \times (1.F) \times 2^{E - 127}$
where F is the mantissa, stored as an unsigned binary fraction (the sign is carried by s, not by two's complement); E is the exponent in excess-127 form; s = 0 for positive numbers and s = 1 for negative numbers.
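The three fields can be inspected with Python's standard `struct` module. This is a small sketch for normalized numbers only; the helper names `decode_ieee754_single` and `normalized_value` are illustrative.

```python
import struct


def decode_ieee754_single(x: float):
    """Split x, stored as IEEE 754 single precision, into (s, E, F)."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    s = word >> 31                      # bit 31
    e = (word >> 23) & 0xFF             # bits 30-23, excess-127 exponent
    f = (word & 0x7FFFFF) / 2 ** 23     # bits 22-0 as a fraction in [0, 1)
    return s, e, f


def normalized_value(s: int, e: int, f: float) -> float:
    """Evaluate (-1)^s * (1.F) * 2^(E - 127) for a normalized number."""
    return (-1) ** s * (1 + f) * 2.0 ** (e - 127)


print(decode_ieee754_single(2.5))       # (0, 128, 0.25)
print(normalized_value(0, 128, 0.25))   # 2.5
```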
Floating point addition

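To perform floating point addition, we first adjust the exponent of the smaller number to match that of the bigger number. Consider
$X = (-1)^{s_1} (1.F_1) \times 2^{E_1 - 127}$ and $Y = (-1)^{s_2} (1.F_2) \times 2^{E_2 - 127}$
With $E_2 \ge E_1$, the sum is
$X + Y = \left[ (-1)^{s_2} (1.F_2) + (-1)^{s_1} (1.F_1) \times 2^{-(E_2 - E_1)} \right] \times 2^{E_2 - 127}$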
Example
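We are given two floating point numbers:
$X = 2.44 = (-1)^0 \times 1.22 \times 2^{128 - 127}$
$Y = -12.16 = (-1)^1 \times 1.52 \times 2^{130 - 127}$
Here, $|Y| > |X|$, so
$X + Y = \left[ (-1)^1 (1.52) + (-1)^0 (1.22) \times 2^{-(130 - 128)} \right] \times 2^{130 - 127} = (-1)^1 \times 1.215 \times 2^3 = -9.72$
So, in the result: s = 1, mantissa = 0.215, exp = 3 + 127 = 130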
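The alignment step can be sketched with numbers held as (s, 1.F, E) triples. This is a simplified model that uses Python floats for the mantissa and ignores rounding to 23 bits; `fp_add` and `fp_value` are illustrative names.

```python
def fp_value(s: int, m: float, e: int) -> float:
    """Value of a triple: (-1)^s * m * 2^(e - 127), with m = 1.F."""
    return (-1) ** s * m * 2.0 ** (e - 127)


def fp_add(x, y):
    """Add two (s, m, e) triples by aligning the smaller exponent first."""
    big, small = (x, y) if x[2] >= y[2] else (y, x)
    shift = big[2] - small[2]
    # Scale the smaller operand's mantissa down to the larger exponent.
    total = (-1) ** big[0] * big[1] + (-1) ** small[0] * small[1] / 2 ** shift
    if total == 0:
        return 0, 0.0, 0
    s, m, e = (1 if total < 0 else 0), abs(total), big[2]
    while m >= 2:                       # renormalize so that 1 <= m < 2
        m, e = m / 2, e + 1
    while m < 1:
        m, e = m * 2, e - 1
    return s, m, e


s, m, e = fp_add((0, 1.22, 128), (1, 1.52, 130))
print(s, round(m, 6), e)                # 1 1.215 130
print(round(fp_value(s, m, e), 6))      # -9.72
```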
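Floating point multiplication-Example

$X = 2.44 = (-1)^0 \times 1.22 \times 2^{128 - 127}$
$Y = -12.16 = (-1)^1 \times 1.52 \times 2^{130 - 127}$
$X \times Y = \left[ (-1)^1 (1.52) \times 2^{130 - 127} \right] \times \left[ (-1)^0 (1.22) \times 2^{128 - 127} \right] = (-1)^1 \times 1.8544 \times 2^4 = -29.6704$
So, in the result: s = 1, mantissa = 0.8544, exp = 4 + 127 = 131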
The mantissas of the two numbers are multiplied, while the exponent terms are added without the need to align them.
Most floating point processors perform automatic normalization so that numbers are properly shifted and aligned. The programmer only needs to take care of the overflow problem. However, because of the large dynamic range, scaling is rarely necessary. Hence, floating point processors are easier to use than fixed point processors.
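Multiplication can be sketched in the same simplified model, with numbers held as (s, 1.F, E) triples and Python floats for the mantissa. `fp_mul` is an illustrative helper, not a processor routine.

```python
def fp_mul(x, y):
    """Multiply two (s, m, e) triples, value = (-1)^s * m * 2^(e - 127)."""
    s = x[0] ^ y[0]                     # signs combine by exclusive OR
    m = x[1] * y[1]                     # mantissas multiply, product in [1, 4)
    e = x[2] + y[2] - 127               # exponents add; remove the extra bias
    if m >= 2:                          # renormalize so that 1 <= m < 2
        m, e = m / 2, e + 1
    return s, m, e


s, m, e = fp_mul((0, 1.22, 128), (1, 1.52, 130))
print(s, round(m, 6), e)                            # 1 1.8544 131
print(round((-1) ** s * m * 2.0 ** (e - 127), 6))   # -29.6704
```

Unlike addition, no exponent alignment is needed before the mantissas are combined.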
COMPARISON
Between fixed point and floating point notations
Comparison

Fixed point                                                     | Floating point
16- or 24-bit devices                                           | 32-bit devices
Limited dynamic range                                           | Large dynamic range
Overflow and quantization errors must be resolved.              | Easier to program as no scaling is required.
Poorer C compiler efficiency; normally programmed in assembly.  | Better C compiler efficiency; can be developed in C.
Faster clock rate                                               | Slower clock rate
Functional units are simpler, less silicon area required.       | Functional units are complex, more silicon area required.
Cheaper                                                         | More expensive
Lower power consumption                                         | Higher power consumption
