21EC733- Module 3 DSP Algorithms and Architecture Notes
21EC733- Module 3 DSP Algorithms and Architecture Notes
21EC733- Module 3 DSP Algorithms and Architecture Notes
MODULE-3
3.1 Introduction:
Leading manufacturers of integrated circuits such as Texas Instruments (TI), Analog devices &
Motorola manufacture the digital signal processor (DSP) chips. These manufacturers have developed a
range of DSP chips with varied complexity.
The TMS320 family consists of two types of single chips DSPs: 16-bit fixed point &32-bit floating-
point. These DSPs possess the operational flexibility of high-speed controllers and the numerical
capability of array processors
Accumulators A and B store the output from the ALU or the multiplier/adder block and provide a
second input to the ALU. Each accumulators is divided into three parts: guards bits (bits 39-32), high-
order word (bits-31-16), and low-order word (bits 15- 0), which can be stored and retrieved
individually. Each accumulator is memory-mapped and partitioned. It can be configured as the
destination registers. The guard bits are used as a head margin for computations.
Page 38 21EC733- DSP Algorithms and Architecture notes
Barrel shifter: provides the capability to scale the data during an operand read or write.
No overhead is required to implement the shift needed for the scaling operations. The’54xx barrel
shifter can produce a left shift of 0 to 31 bits or a right shift of 0 to 16 bits on the input data. The shift
count field of status registers ST1, or in the temporary
register T. Figure 3.3 shows the functional diagram of the barrel shifter of TMS320C54xx processors.
The barrel shifter and the exponent encoder normalize the values in an accumulator in a single cycle.
The LSBs of the output are filled with0s, and the MSBs can be either zero filled or sign extended,
depending on the state of the sign-extension mode bit in the status register ST1. An additional shift
capability enables the processor to perform numerical scaling, bit extraction, extended arithmetic, and
overflow prevention operations.
Page 39 21EC733- DSP Algorithms and Architecture notes
Multiplier/adder unit: The kernel of the DSP device architecture is multiplier/adder unit. The
multiplier/adder unit of TMS320C54xx devices performs 17 x 17 2’s complement multiplication with
a 40-bit addition effectively in a single instruction cycle.
In addition to the multiplier and adder, the unit consists of control logic for integer and
fractional computations and a 16-bit temporary storage register, T. Figure 3.4 show the functional
diagram of the multiplier/adder unit of TMS320C54xx processors. The compare, select, and store unit
(CSSU) is a hardware unit specifically incorporated to accelerate the add/compare/select operation.
This operation is essential to implement the Viterbi algorithm used in many signal-processing
applications. The exponent encoder unit supports the EXP instructions, which stores in the T register
the number of leading redundant bits of the accumulator content. This information is useful while
shifting the accumulator content for the purpose of scaling.
Page 40 21EC733- DSP Algorithms and Architecture notes
memory. Figure 3.5(a) and (b) shows the internal CPU registers and peripheral registers with their
addresses. The processors mode status (PMST) registers
that is used to configure the processor. It is a memory-mapped register located at address 1Dh on page
0 of the RAM. A part of on-chip ROM may contain a boot loader and look-up tables for function such
as sine, cosine, μ- law, and A- law.
HM: Hold mode, indicates whether the processor continues internal execution or acknowledge for
external interface.
INTR: Interrupt vector pointer, point to the 128-word program page where the interrupt vectors
reside.
MP/MC: Microprocessor/Microcomputer mode,
MP/MC=0, the on chip ROM is enabled.
MP/MC=1, the on chip ROM is enabled.
OVLY: RAM OVERLAY, OVLY enables on chip dual access data RAM blocks to be mapped into
program space.
AVIS: It enables/disables the internal program address to be visible at the address pins.
DROM: Data ROM, DROM enables on-chip ROM to be mapped into data space.
CLKOFF: CLOCKOUT off.
Data addressing modes provide various ways to access operands to execute instructions and place
results in the memory or the registers. The 54XX devices offer seven basic addressing modes
1. Immediate addressing.
2. Absolute addressing.
3. Accumulator addressing.
4. Direct addressing.
5. Indirect addressing.
6. Memory mapped addressing
7. Stack addressing.
Figure 3.7 Block diagram of the direct addressing mode for TMS320C54xx Processors.
TMS320C54xx have 8, 16 bit auxiliary register (AR0 – AR 7). Two auxiliary register arithmetic units
(ARAU0 & ARAU1)
Used to access memory location in fixed step size. AR0 register is used for indexed and bit reverse
addressing modes.
– operand addressing
MOD _ type of indirect addressing
ARF _ AR used for addressing
ARP depends on (CMPT) bit in ST1
CMPT = 0, Standard mode, ARP set to zero
CMPT = 1, Compatibility mode, Particularly AR selected by ARP
Page 47 21EC733- DSP Algorithms and Architecture notes
Page 48 21EC733- DSP Algorithms and Architecture notes
Table 3.2 Indirect addressing options with a single data –memory operand.
Circular Addressing;
Bit-Reversed Addressing:
o Used for FFT algorithms.
o AR0 specifies one half of the size of the FFT.
o The value of AR0 = 2N-1: N = integer FFT size = 2N
o AR0 + AR (selected register) = bit reverse addressing.
o The carry bit propagating from left to right.
Dual-Operand Addressing:
Dual data-memory operand addressing is used for instruction that simultaneously
perform two reads (32-bit read) or a single read (16-bit read) and a parallel store (16-bit
store) indicated by two vertical bars, II. These instructions access operands using indirect addressing
mode.
If in an instruction with a parallel store the source operand the destination operand point to the
same location, the source is read before writing to the destination. Only 2 bits are available in the
instruction code for selecting each auxiliary register in this mode. Thus, just four of the auxiliary
registers, AR2-AR5, can be used, The ARAUs together with these registers, provide capability to
access two operands in a single cycle. Figure 3.11 shows how an address is generated using dual data-
memory operand addressing.
Page 51 21EC733- DSP Algorithms and Architecture notes
Page 52 21EC733- DSP Algorithms and Architecture notes
Problems:
1. Assuming the current content of AR3 to be 200h, what will be its contents after
each of the following TMS320C54xx addressing modes is used? Assume that the
contents of AR0 are 20h.
a. *AR3+0
b. *AR3-0
c. *AR3+
d. *AR3
e. *AR3
f. *+AR3 (40h)
g. *+AR3 (-40h)
Solution:
a. AR3 ← AR3 + AR0;
AR3 = 200h + 20h = 220h
b. AR3← AR3 - AR0;
AR3 = 200h - 20h = 1E0h
c. AR3 ← AR3 + 1;
AR3 = 200h + 1 = 201h
d. AR3 ← AR3 - 1;
AR3 = 200h - 1 = 1FFh
e. AR3 is not modified.
AR3 = 200h
f. AR3 ← AR3 + 40h;
AR3 = 200 + 40h = 240h
g. AR3 ← AR3 - 40h;
AR3 = 200 - 40h = 1C0h
Page 56 21EC733- DSP Algorithms and Architecture notes
2. Assuming the current contents of AR3 to be 200h, what will be its contents after
each of the following TMS320C54xx addressing modes is used? Assume that the contents of AR0 are
20h
a. *AR3 + 0B
b. *AR3 – 0B
Solution:
a. AR3 ← AR3 + AR0 with reverse carry propagation;
AR3 = 200h + 20h (with reverse carry propagation) = 220h.
b. AR3 ← AR3 - AR0 with reverse carry propagation;
AR3 = 200h - 20h (with reverse carry propagation) = 23Fh.
Recommended Questions:
1. Compare architectural features of TMS320C25 and DSP6000 fixed point digital signal
processors. (Dec.09-Jan.10, 6m)
2. Write an explanatory note on direct addressing mode of TMS320C54XX processors. Give
example. (Dec.09-Jan.10, 6m)
3. Describe the operation of the following instructions of TMS320C54XX processors.
i) MPY *AR2-,*AR4+0B (ii) MAC *ar5+,#1234h,A (iii) STH A,1,*AR2 iv) SSBX
SXM (Dec.09-Jan.10, 8m)
4. With a block diagram explain the indirect addressing mode of TMS320C54XX processor using
dual data memory operand. (June.12, 6m)
5. What is the function of an address generation unit explain with the help of block diagram.
(Dec.12, 6m)
6. Why circular buffers are required in DSP processor? How they are implemented? (Dec.12, 2m)
7. Explain the direct addressing mode of the TMS320C54XX processor with the help of a block
diagram. (Dec.12, 2m)
8. Describe the multiplier/adder unit of TMS320c54xx processor with a neat block diagram.
(May/June2010, 6m)
9. Describe any four data addressing modes of TMS320c54xx processor(May/June2010, 8m)
10. Assume that the current content of AR3 is 400h, what will be its contents after each of the
following. Assume that the content of AR0 is 40h. (May/June2010, 8m)
Page 57 21EC733- DSP Algorithms and Architecture notes
Branch Instructions
NOP: No Operation
The timer register (TIM) is a 16-bit memory-mapped register that decrements at every pulse from the
prescaler block (PSC).
The timer period register (PRD) is a 16-bit memory-mapped register whose contents are loaded onto
the TIM whenever the TIM decrements to zero or the device is reset (SRESET).
The timer can also be independently reset using the TRB signal. The timer control register
(TCR) is a 16-bit memory-mapped register that contains status and control bits. Table shows the
functions of the various bits in the TCR.
The prescaler block is also an on-chip counter. Whenever the prescaler bits count down to 0, a
clock pulse is given to the TIM register that decrements the TIM register by 1. The TDDR bits contain
the divide-down ratio, which is loaded onto the prescaler block after each time the prescaler bits count
down to 0.
That is to say that the 4-bit value of TDDR determines the divide-by ratio of the timer clock
with respect to the system clock. In other words, the TIM decrements either at the rate of the system
clock or at a rate slower than that as decided by the value of the TDDR bits. TOUT and TINT are the
output signal generated as the TIM register decrements to 0. TOUT can trigger the start of the
conversion signal in an ADC interfaced to the DSP.
Page 102 21EC733- DSP Algorithms and Architecture notes
The sampling frequency of the ADC determines how frequently it receives the TOUT signal.
TINT is used to generate interrupts, which are required to service a peripheral such as a DRAM
controller periodically. The timer can also be stopped, restarted, reset, or disabled by specific status
bits.
Page 103 21EC733- DSP Algorithms and Architecture notes
The synchronous serial ports are high-speed, full-duplex ports and that provide direct
Page 105 21EC733- DSP Algorithms and Architecture notes
communications with serial devices, such as codec, and analog-to-digital (A/D) converters. A buffered
serial port (BSP) is synchronous serial port that is provided with
an auto buffering unit and is clocked at the full clock rate. The head of servicing interrupts. A time-
division multiplexed (TDM) serial port is a synchronous serial port that is provided to allow time-
division multiplexing of the data. The functioning of each of these on-chip peripherals is controlled by
memory-mapped registers assigned to the respective peripheral.
5 In the read phase the data operand(s), if any, are read from the data buses, DB and CB. This phase
completes the two-phase read process and starts the two phase write processes. The data address of the
Page 106 21EC733- DSP Algorithms and Architecture notes
write operand, if any, is loaded into the data write address bus, EAB.
6 The execute phase writes the data using the data write bus, EB, and completes the operand write
sequence. The instruction is executed in this phase.
Page 107 21EC733- DSP Algorithms and Architecture notes
Recommended Questions: