21EC733- Module 3 DSP Algorithms and Architecture Notes


Page 34 21EC733- DSP Algorithms and Architecture notes

MODULE-3

Programmable Digital Signal Processors

3.1 Introduction:
Leading manufacturers of integrated circuits, such as Texas Instruments (TI), Analog Devices, and Motorola, manufacture digital signal processor (DSP) chips. These manufacturers have developed a range of DSP chips of varied complexity.
The TMS320 family consists of two types of single-chip DSPs: 16-bit fixed-point and 32-bit floating-point. These DSPs possess the operational flexibility of high-speed controllers and the numerical capability of array processors.

3.2 Commercial Digital Signal-Processing Devices:


There are several families of commercial DSP devices. Since the early 1980s, when these devices began to appear on the market, they have been used in numerous applications, such as communication, control, computers, instrumentation, and consumer electronics. The architectural features and the processing power of these devices have been constantly upgraded based on advances in technology and application needs. Even in their basic versions, however, most of them have a Harvard architecture, a single-cycle hardware multiplier, an address generation unit with dedicated address registers, special addressing modes, and on-chip peripheral interfaces. Of the various families of programmable DSP devices that are commercially available, the three most popular are those from Texas Instruments, Motorola, and Analog Devices. Texas Instruments was one of the first to come out with a commercial programmable DSP, introducing its TMS32010 in 1982.

Summary of the architectural features of three fixed-point DSPs



3.3. The architecture of TMS320C54xx digital signal processors:


TMS320C54xx processors retain the basic Harvard architecture of their predecessor, the TMS320C25, but have several additional features that improve on its performance. Figure 3.1 shows a functional block diagram of TMS320C54xx processors. They have one program and three data memory spaces with separate buses, which provide simultaneous access to a program instruction and two data operands and enable writing of a result at the same time. Part of the memory is implemented on-chip and consists of combinations of ROM, dual-access RAM, and single-access RAM. Transfers between the memory spaces are also possible.
The central processing unit (CPU) of TMS320C54xx processors consists of a 40-bit arithmetic logic unit (ALU), two 40-bit accumulators, a barrel shifter, a 17 x 17 multiplier, a 40-bit adder, data address generation logic (DAGEN) with its own arithmetic unit, and program address generation logic (PAGEN). These major functional units are supported by a number of registers and logic in the architecture. A powerful instruction set, with hardware-supported single-instruction repeat and block repeat operations, block memory move instructions, instructions that pack two or three simultaneous reads, and arithmetic instructions with parallel store and load, makes these devices very efficient for running high-speed DSP algorithms.
Several peripherals, such as a clock generator, a hardware timer, a wait state generator, parallel I/O ports, and serial I/O ports, are also provided on-chip. These peripherals make it convenient to interface the signal processors to the outside world. In the following sections, we examine in detail the various architectural features of the TMS320C54xx family of processors.

Figure 3.1. Functional architecture for TMS320C54xx processors.



3.3.1 Bus Structure:


The performance of a processor is enhanced by providing multiple buses that give simultaneous access to various parts of memory or peripherals. The ’54xx architecture is built around four pairs of 16-bit buses, each pair consisting of an address bus and a data bus. As shown in Figure 3.1, these are:
• The program bus pair (PAB, PB), which carries the instruction code from the program memory.
• Three data bus pairs (CAB, CB; DAB, DB; and EAB, EB), which interconnect the various units within the CPU. In addition, the pairs CAB, CB and DAB, DB are used to read from the data memory, while the pair EAB, EB carries the data to be written to the memory.
The ’54xx can generate up to two data-memory addresses per cycle using the two auxiliary register arithmetic units (ARAU0 and ARAU1) in the DAGEN block. This enables accessing two operands simultaneously.

3.3.2 Central Processing Unit (CPU):


The ’54xx CPU is common to all the ’54xx devices. It contains a 40-bit arithmetic logic unit (ALU); two 40-bit accumulators (A and B); a barrel shifter; a 17 x 17-bit multiplier; a 40-bit adder; a compare, select, and store unit (CSSU); an exponent encoder (EXP); a data address generation unit (DAGEN); and a program address generation unit (PAGEN).
The ALU performs 2’s complement arithmetic operations and bit-level Boolean operations on 16-, 32-, and 40-bit words. It can also function as two separate 16-bit ALUs and perform two 16-bit operations simultaneously. Figure 3.2 shows the functional diagram of the ALU of the TMS320C54xx family of devices.

Accumulators A and B store the output from the ALU or the multiplier/adder block and provide a second input to the ALU. Each accumulator is divided into three parts: guard bits (bits 39-32), high-order word (bits 31-16), and low-order word (bits 15-0), which can be stored and retrieved individually. Each accumulator is memory-mapped and partitioned, and either can be configured as the destination register. The guard bits provide headroom for computations.
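The three-part layout of an accumulator can be sketched in a few lines of Python. This is only an illustrative model of the bit fields named above; the function name `split_accumulator` is made up for this example and is not a TI identifier.

```python
def split_accumulator(acc40: int) -> tuple:
    """Split a 40-bit accumulator value into (guard, high, low) fields."""
    acc40 &= (1 << 40) - 1          # keep only the 40 accumulator bits
    guard = (acc40 >> 32) & 0xFF    # guard bits, bits 39-32
    high = (acc40 >> 16) & 0xFFFF   # high-order word, bits 31-16
    low = acc40 & 0xFFFF            # low-order word, bits 15-0
    return guard, high, low


print([hex(field) for field in split_accumulator(0xFF80001234)])
# ['0xff', '0x8000', '0x1234']
```

Because each field can be stored and retrieved individually, a 32-bit result can be saved as two 16-bit words while the guard bits absorb intermediate growth.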

Figure 3.2. Functional diagram of the central processing unit of the TMS320C54xx processors.

Barrel shifter: provides the capability to scale the data during an operand read or write. No overhead is required to implement the shift needed for the scaling operations. The ’54xx barrel shifter can produce a left shift of 0 to 31 bits or a right shift of 0 to 16 bits on the input data. The shift count is specified in the shift count field of the instruction, in the shift count field (ASM) of status register ST1, or in the temporary register T. Figure 3.3 shows the functional diagram of the barrel shifter of TMS320C54xx processors. The barrel shifter and the exponent encoder normalize the values in an accumulator in a single cycle. The LSBs of the output are filled with 0s, and the MSBs can be either zero filled or sign extended, depending on the state of the sign-extension mode bit (SXM) in status register ST1. An additional shift capability enables the processor to perform numerical scaling, bit extraction, extended arithmetic, and overflow prevention operations.

Figure 3.3. Functional diagram of the barrel shifter

Multiplier/adder unit: The kernel of the DSP device architecture is the multiplier/adder unit. The multiplier/adder unit of TMS320C54xx devices performs a 17 x 17-bit 2’s complement multiplication with a 40-bit addition, effectively in a single instruction cycle.
In addition to the multiplier and adder, the unit consists of control logic for integer and fractional computations and a 16-bit temporary storage register, T. Figure 3.4 shows the functional diagram of the multiplier/adder unit of TMS320C54xx processors.
The compare, select, and store unit (CSSU) is a hardware unit specifically incorporated to accelerate the add/compare/select operation. This operation is essential to implement the Viterbi algorithm used in many signal-processing applications.
The exponent encoder unit supports the EXP instruction, which stores in the T register the number of leading redundant bits of the accumulator content. This information is useful while shifting the accumulator content for the purpose of scaling.
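The idea behind the exponent encoder can be modelled roughly as follows. This sketch assumes that the value placed in T is the count of redundant sign bits in the 40-bit accumulator minus 8 (so that the 8 guard bits are discounted), and that a zero accumulator yields 0; both points are assumptions to check against the TI instruction set reference. The names `to_signed40` and `exponent` are hypothetical.

```python
def to_signed40(acc40: int) -> int:
    """Interpret a 40-bit pattern as a signed 2's complement value."""
    acc40 &= (1 << 40) - 1
    return acc40 - (1 << 40) if acc40 & (1 << 39) else acc40


def exponent(acc40: int) -> int:
    """Redundant sign bits of the accumulator minus 8 (assumed EXP result)."""
    value = to_signed40(acc40)
    if value == 0:
        return 0                                   # assumption: zero input gives 0
    magnitude = value if value >= 0 else ~value
    significant = magnitude.bit_length() + 1       # data bits plus one sign bit
    redundant = 40 - significant                   # leading bits that repeat the sign
    return redundant - 8


print(exponent(0xFFFFFFFFCB))  # 25
print(exponent(0x0785432105))  # -4
```

A positive result is the left shift that normalizes the value within 32 bits; a negative result means the value has grown into the guard bits and must be shifted right.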

Figure 3.4. Functional diagram of the multiplier/adder unit of TMS320C54xx processors.

3.3.3 Internal Memory and Memory-Mapped Registers:


The amount and the types of memory of a processor have direct relevance to the efficiency and performance obtainable in implementations with the processor. The ’54xx memory is organized into three individually selectable spaces: program, data, and I/O. All ’54xx devices contain both RAM and ROM. RAM can be either dual-access type (DARAM) or single-access type (SARAM). The on-chip RAM for these processors is organized in pages of 128 word locations each.
The ’54xx processors have a number of CPU registers to support operand addressing and computations. The CPU registers and peripheral registers are all located on page 0 of the data memory. Figure 3.5(a) and (b) show the internal CPU registers and peripheral registers with their addresses. The processor mode status (PMST) register is used to configure the processor. It is a memory-mapped register located at address 1Dh on page 0 of the RAM. A part of the on-chip ROM may contain a boot loader and look-up tables for functions such as sine, cosine, μ-law, and A-law.

Figure 3.5(a) Internal memory-mapped registers of TMS320C54xx processors.



Figure 3.5(b). Peripheral registers for the TMS320C54xx processors

Status registers (ST0, ST1):

ST0: Contains the status flags (OVA, OVB, C, TC) produced by arithmetic operations and bit manipulations.
ST1: Contains the status of various conditions and modes. Bits of the ST0 and ST1 registers can be set or cleared with the SSBX and RSBX instructions.
PMST: Contains memory-setup status and control information.

ARP: Auxiliary register pointer.


TC: Test/control flag.
C: Carry bit.
OVA: Overflow flag for accumulator A.
OVB: Overflow flag for accumulator B.
DP: Data-memory page pointer.

BRAF: Block repeat active flag


BRAF=0, the block repeat is deactivated.
BRAF=1, the block repeat is activated.

CPL: Compiler mode


CPL=0, the relative direct addressing mode using data page pointer is selected.
CPL=1, the relative direct addressing mode using stack pointer is selected.

HM: Hold mode. Indicates whether the processor continues internal execution or halts it while acknowledging a hold request on the external interface.

INTM: Interrupt mode. It globally masks or enables all interrupts.

INTM=0, all unmasked interrupts are enabled.
INTM=1, all maskable interrupts are disabled.
0: Always read as 0.

OVM: Overflow mode.

OVM=1, on overflow the destination accumulator is set to either the most positive value or the most negative value.
OVM=0, the overflowed result is left in the destination accumulator.

SXM: Sign extension mode.

SXM=0, sign extension is suppressed.
SXM=1, data is sign extended.

C16: Dual 16-bit/double-precision arithmetic mode.

C16=0, the ALU operates in double-precision arithmetic mode.
C16=1, the ALU operates in dual 16-bit arithmetic mode.

FRCT: Fractional mode.

FRCT=1, the multiplier output is left-shifted by 1 bit to compensate for an extra sign bit.

CMPT: Compatibility mode.

CMPT=0, ARP is not updated in the indirect addressing mode.
CMPT=1, ARP is updated in the indirect addressing mode.

ASM: Accumulator shift mode.

A 5-bit field that specifies a shift value in the range -16 to 15.
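The effect of three of these mode bits (OVM, FRCT, and ASM) can be illustrated with a small Python model. This is a sketch under stated assumptions, not a cycle-accurate description: saturation is assumed to clamp to the signed 32-bit range, the ASM field is assumed to be a 5-bit 2's complement number, and all function names are invented for this example.

```python
MAX32 = 0x7FFFFFFF      # assumed "most positive value" (32-bit)
MIN32 = -0x80000000     # assumed "most negative value" (32-bit)


def apply_ovm(result: int, ovm: int) -> int:
    """Clamp a signed result to 32 bits when OVM = 1; otherwise leave it."""
    if ovm == 1:
        return max(MIN32, min(MAX32, result))
    return result


def multiply(a: int, b: int, frct: int) -> int:
    """Signed multiply; FRCT = 1 shifts the product left by one bit."""
    product = a * b
    return product << 1 if frct == 1 else product


def decode_asm(field: int) -> int:
    """Interpret the 5-bit ASM field as a signed shift count (-16 to 15)."""
    field &= 0x1F
    return field - 32 if field & 0x10 else field


def shift(value: int, count: int) -> int:
    """Positive counts shift left; negative counts shift right."""
    return value << count if count >= 0 else value >> -count


print(hex(apply_ovm(MAX32 + 1, ovm=1)))          # 0x7fffffff
print(hex(multiply(0x4000, 0x4000, frct=1)))     # 0x20000000
print(decode_asm(0b10000), decode_asm(0b01111))  # -16 15
```

In the second line of output, 0x4000 is 0.5 in Q15 format; with FRCT = 1 the product 0x20000000 is 0.25 in Q31 format, which shows why the extra left shift is needed for fractional arithmetic.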

Processor Mode Status Register (PMST):

IPTR: Interrupt vector pointer. Points to the 128-word program page where the interrupt vectors reside.
MP/MC: Microprocessor/microcomputer mode.
MP/MC=0, the on-chip ROM is enabled.
MP/MC=1, the on-chip ROM is not available.

OVLY: RAM overlay. OVLY enables on-chip dual-access data RAM blocks to be mapped into program space.

AVIS: Address visibility mode. It enables or disables visibility of the internal program address at the address pins.
DROM: Data ROM. DROM enables on-chip ROM to be mapped into data space.
CLKOFF: CLOCKOUT off.

SMUL: Saturation on multiplication.

SST: Saturation on store.



3.4 Data Addressing Modes of TMS320C54X Processors:

Data addressing modes provide various ways to access operands to execute instructions and place results in the memory or the registers. The ’54xx devices offer seven basic addressing modes:
1. Immediate addressing.
2. Absolute addressing.
3. Accumulator addressing.
4. Direct addressing.
5. Indirect addressing.
6. Memory mapped addressing
7. Stack addressing.

3.4.1 Immediate addressing:


The instruction contains the specific value of the operand. The operand can be short (3, 5, 8, or 9 bits in length) or long (16 bits in length). An instruction with a short operand occupies one memory location, whereas one with a long operand occupies two.
Example: LD #20, DP
RPT #0FFFFh

3.4.2 Absolute Addressing:


The instruction contains a specified address in the operand. There are four types:
1. dmad addressing: MVDK Smem, dmad; MVDM dmad, MMR
2. pmad addressing: MVDP Smem, pmad; MVPD pmad, Smem
3. PA addressing: PORTR PA, Smem
4. *(lk) addressing

3.4.3 Accumulator Addressing:


The content of accumulator A is used as the address to transfer data between program and data memory.
Ex: READA *AR2

3.4.4 Direct Addressing:


The 16-bit data-memory address is formed from a base address and the 7-bit offset contained in the instruction. A page of 128 locations can therefore be accessed without changing DP or SP. The compiler mode bit (CPL) in the ST1 register selects the base:
CPL = 0 selects DP
CPL = 1 selects SP
When DP is used, it supplies the upper bits of the address and the 7-bit offset supplies the lower bits. When SP is used instead of DP, the effective address is computed by adding the 7-bit offset to SP.
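The address computation for direct addressing can be sketched as below. The model assumes a 9-bit DP that is concatenated with the 7-bit offset (9 + 7 = 16 bits) and a 16-bit SP to which the offset is added; the function name `direct_address` is illustrative only.

```python
def direct_address(offset7: int, cpl: int, dp: int = 0, sp: int = 0) -> int:
    """Form the 16-bit data address used by direct addressing."""
    offset7 &= 0x7F                              # 7-bit offset from the instruction
    if cpl == 0:
        return ((dp & 0x1FF) << 7) | offset7     # DP-relative: DP selects the page
    return (sp + offset7) & 0xFFFF               # SP-relative: offset added to SP


print(hex(direct_address(0x05, cpl=0, dp=20)))      # 0xa05
print(hex(direct_address(0x05, cpl=1, sp=0x3FF0)))  # 0x3ff5
```

With DP = 20, as loaded by LD #20, DP, the instruction can reach any of the 128 words from 0A00h to 0A7Fh by changing only the offset.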

Figure 3.7 Block diagram of the direct addressing mode for TMS320C54xx Processors.

3.4.5 Indirect Addressing:

TMS320C54xx processors have eight 16-bit auxiliary registers (AR0-AR7) and two auxiliary register arithmetic units (ARAU0 and ARAU1). Indirect addressing is used to access memory locations in fixed step sizes. The AR0 register is used for the indexed and bit-reversed addressing modes.
Single-operand addressing:
MOD: type of indirect addressing
ARF: AR used for addressing
The use of ARP depends on the compatibility mode (CMPT) bit in ST1:
CMPT = 0, standard mode; ARP is set to zero
CMPT = 1, compatibility mode; the particular AR is selected by ARP

Table 3.2 Indirect addressing options with a single data-memory operand.
Circular Addressing:

• Used in convolution, correlation, and FIR filters.

• A circular buffer is a sliding window that contains the most recent data. A circular buffer of size R must start on an N-bit boundary, where 2^N > R.

• Effective base address (EFB): obtained by zeroing the N LSBs of a user-selected AR (ARx).

• The index (the N LSBs of ARx) is updated against the buffer size held in BK:

if 0 ≤ index + step < BK: index = index + step
else if index + step ≥ BK: index = index + step - BK
else if index + step < 0: index = index + step + BK
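The circular update rule can be tried out with a short Python model. This is a minimal sketch that assumes the step magnitude never exceeds BK and that N is the smallest integer with 2^N > BK; `circular_update` is a hypothetical helper name.

```python
def circular_update(arx: int, step: int, bk: int) -> int:
    """Return the new ARx after one circular-addressing update."""
    n = bk.bit_length()              # smallest N such that 2**N > BK
    mask = (1 << n) - 1
    base = arx & ~mask & 0xFFFF      # effective base address: N LSBs zeroed
    index = arx & mask               # current position inside the buffer
    new_index = index + step
    if 0 <= new_index < bk:
        index = new_index
    elif new_index >= bk:
        index = new_index - bk       # wrap past the end of the buffer
    else:
        index = new_index + bk       # wrap before the start of the buffer
    return base | index


ar = 0x0103                          # buffer of 5 words starting at 0100h
for _ in range(4):
    ar = circular_update(ar, step=1, bk=5)
    print(hex(ar))                   # 0x104, 0x100, 0x101, 0x102
```

The pointer wraps from 0104h back to 0100h without any compare-and-branch code, which is what makes circular buffers cheap for FIR delay lines.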

Bit-Reversed Addressing:
o Used for FFT algorithms.
o AR0 specifies one half of the size of the FFT.
o The value of AR0 = 2^(N-1), where N is an integer and the FFT size is 2^N.
o AR0 is added to the selected auxiliary register using bit-reversed (reverse-carry) addition.
o The carry bit propagates from left to right instead of from right to left.
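Reverse-carry addition can be modelled by reversing the bits of both operands, adding normally, and reversing the result. The sketch below does this for 16-bit registers; the helper names are invented for illustration, and this mirrors the arithmetic only, not the ARAU hardware.

```python
def reverse16(x: int) -> int:
    """Reverse the bit order of a 16-bit value."""
    return int(f"{x & 0xFFFF:016b}"[::-1], 2)


def bitrev_add(ar: int, ar0: int) -> int:
    """ar + ar0 with the carry propagating from left to right."""
    return reverse16((reverse16(ar) + reverse16(ar0)) & 0xFFFF)


def bitrev_sub(ar: int, ar0: int) -> int:
    """ar - ar0 with the borrow propagating from left to right."""
    return reverse16((reverse16(ar) - reverse16(ar0)) & 0xFFFF)


# 8-point FFT: AR0 holds half the FFT size (4); start from offset 0.
ar, order = 0, []
for _ in range(8):
    order.append(ar)
    ar = bitrev_add(ar, 4)
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The printed sequence is the familiar bit-reversed ordering of an 8-point FFT. The same helpers give 220h for 200h + 20h and 23Fh for 200h - 20h, which agrees with Problem 2 at the end of this module.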

Dual-Operand Addressing:
Dual data-memory operand addressing is used for instructions that simultaneously perform two reads (32-bit read) or a single read (16-bit read) and a parallel store (16-bit store), indicated by two vertical bars, ||. These instructions access operands using the indirect addressing mode.
If, in an instruction with a parallel store, the source operand and the destination operand point to the same location, the source is read before the destination is written. Only 2 bits are available in the instruction code for selecting each auxiliary register in this mode. Thus, just four of the auxiliary registers, AR2-AR5, can be used. The ARAUs, together with these registers, provide the capability to access two operands in a single cycle. Figure 3.11 shows how an address is generated using dual data-memory operand addressing.

3.4.6. Memory-Mapped Register Addressing:

• Used to modify the memory-mapped registers without affecting the current data page pointer (DP) or stack pointer (SP).
o Overhead for writing to a register is minimal.
o Works for direct and indirect addressing.
o Scratch-pad RAM located on data page 0 can be modified.
• STM #x, DIRECT
• STM #tbl, AR1

3.4.7 Stack Addressing:


• Used to automatically store the program counter during interrupts and subroutines.
• Can be used to store additional items of context or to pass data values.
• Uses a 16-bit memory-mapped register, the stack pointer (SP).
• Example: PSHD X2

3.5. Memory Space of TMS320C54xx Processors

• A total of 128K words, extendable up to 8192K words.
• Total memory includes RAM, ROM, EPROM, EEPROM, or memory-mapped peripherals.
• mapped registers.

Figure 3.14 Memory map for the TMS320C5416 Processor.



3.6. Program Control

• The program controller contains the program counter (PC), the program-counter-related hardware, the hardware stack, repeat counters, and status registers.
• The PC addresses memory in several ways:
o Branch: The PC is loaded with the immediate value following the branch instruction.
o Subroutine call: The PC is loaded with the immediate value following the call instruction.
o Interrupt: The PC is loaded with the address of the appropriate interrupt vector.
o Instructions such as BACC, CALA, etc.: The PC is loaded with the contents of the accumulator low word.
o End of a block repeat loop: The PC is loaded with the contents of the block-repeat start address register.
o Return: The PC is loaded from the top of the stack.

Problems:

1. Assuming the current content of AR3 to be 200h, what will be its contents after
each of the following TMS320C54xx addressing modes is used? Assume that the
contents of AR0 are 20h.
a. *AR3+0
b. *AR3-0
c. *AR3+
d. *AR3-
e. *AR3
f. *+AR3(40h)
g. *+AR3(-40h)
Solution:
a. AR3 ← AR3 + AR0;
AR3 = 200h + 20h = 220h
b. AR3← AR3 - AR0;
AR3 = 200h - 20h = 1E0h
c. AR3 ← AR3 + 1;
AR3 = 200h + 1 = 201h
d. AR3 ← AR3 - 1;
AR3 = 200h - 1 = 1FFh
e. AR3 is not modified.
AR3 = 200h
f. AR3 ← AR3 + 40h;
AR3 = 200h + 40h = 240h
g. AR3 ← AR3 - 40h;
AR3 = 200h - 40h = 1C0h
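The seven results above can be checked with a small table-driven model of the auxiliary register updates. The mode keys and the function name `update_ar` are labels made up for this sketch; they stand for the AR0-indexed, unit-step, unmodified, and long-offset (lk) cases used in the solution.

```python
def update_ar(ar: int, mode: str, ar0: int = 0, lk: int = 0) -> int:
    """Return the auxiliary register value after one indirect access."""
    change = {
        "add_ar0": ar0,     # AR <- AR + AR0
        "sub_ar0": -ar0,    # AR <- AR - AR0
        "inc": 1,           # AR <- AR + 1
        "dec": -1,          # AR <- AR - 1
        "none": 0,          # AR is not modified
        "add_lk": lk,       # AR <- AR + lk (lk may be negative)
    }[mode]
    return (ar + change) & 0xFFFF    # registers are 16 bits wide


cases = [("add_ar0", 0), ("sub_ar0", 0), ("inc", 0), ("dec", 0),
         ("none", 0), ("add_lk", 0x40), ("add_lk", -0x40)]
for mode, lk in cases:
    print(mode, hex(update_ar(0x200, mode, ar0=0x20, lk=lk)))
```

Run with AR3 = 200h and AR0 = 20h, the loop prints 0x220, 0x1e0, 0x201, 0x1ff, 0x200, 0x240, and 0x1c0, matching answers a to g.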

2. Assuming the current contents of AR3 to be 200h, what will be its contents after
each of the following TMS320C54xx addressing modes is used? Assume that the contents of AR0 are
20h
a. *AR3+0B
b. *AR3-0B
Solution:
a. AR3 ← AR3 + AR0 with reverse carry propagation;
AR3 = 200h + 20h (with reverse carry propagation) = 220h.
b. AR3 ← AR3 - AR0 with reverse carry propagation;
AR3 = 200h - 20h (with reverse carry propagation) = 23Fh.

Recommended Questions:
1. Compare the architectural features of the TMS320C25 and DSP6000 fixed-point digital signal processors. (Dec.09-Jan.10, 6m)
2. Write an explanatory note on the direct addressing mode of TMS320C54XX processors. Give an example. (Dec.09-Jan.10, 6m)
3. Describe the operation of the following instructions of TMS320C54XX processors:
(i) MPY *AR2-,*AR4+0B (ii) MAC *ar5+,#1234h,A (iii) STH A,1,*AR2 (iv) SSBX SXM (Dec.09-Jan.10, 8m)
4. With a block diagram, explain the indirect addressing mode of the TMS320C54XX processor using dual data-memory operands. (June.12, 6m)
5. What is the function of an address generation unit? Explain with the help of a block diagram. (Dec.12, 6m)
6. Why are circular buffers required in a DSP processor? How are they implemented? (Dec.12, 2m)
7. Explain the direct addressing mode of the TMS320C54XX processor with the help of a block diagram. (Dec.12, 2m)
8. Describe the multiplier/adder unit of the TMS320C54xx processor with a neat block diagram. (May/June2010, 6m)
9. Describe any four data addressing modes of the TMS320C54xx processor. (May/June2010, 8m)
10. Assume that the current content of AR3 is 400h. What will be its contents after each of the following? Assume that the content of AR0 is 40h. (May/June2010, 8m)
11. Explain the PMST register. (May/June2011, 8m)
12. With an example each, explain the immediate, absolute, and direct addressing modes. (May/June2011, 12m)
13. Explain the functioning of the barrel shifter in the TMS320C54XX processor. (June.12, 6m)
14. Explain sequential and other types of program control. (June.11, 7m)

Instruction and programming

4.1 Assembly language instructions can be classified as:

• Arithmetic operations
• Load and store instructions
• Logical operations
• Program-control operations

4.1.1 Arithmetic Instructions:



MVPD: Move Data From Program Memory to Data Memory

PORTR: Read Data from Port

PORTW: Write Data to Port



READA: Read Program Memory Addressed by Accumulator A and Store in Data Memory

WRITA: Write Data to Program Memory Addressed by Accumulator A

Branch Instructions

B[D]: Branch Unconditionally

BACC[D]: Branch to Location Specified by Accumulator



BANZ[D]: Branch on Auxiliary Register Not Zero

BC [D]: Branch Conditionally

FB [D]: Far Branch Unconditionally

FBACC [D]: Far Branch to Location Specified by Accumulator



CALA [D]: Call Subroutine at Location Specified by Accumulator

CALL[D]: Call Unconditionally

CC [D]: Call Conditionally



FCALA [D]: Far Call Subroutine at Location Specified by Accumulator



4.1.5. Interrupt Instructions:

INTR: Software Interrupt

TRAP: Software Interrupt



4.1.6. Return Instructions

FRET [D]: Far Return

FRETE [D]: Enable Interrupts and Far Return From Interrupt

RC [D]: Return Conditionally



RET [D]: Return

RETF [D]: Enable Interrupts and Fast Return From Interrupt

4.1.7. Repeat Instructions

RPT: Repeat Next Instruction

RPTB [D]: Block Repeat



RPTZ: Repeat Next Instruction and Clear Accumulator

4.1.8. Stack-Manipulating Instructions

FRAME: Stack Pointer Immediate Offset

POPD: Pop Top of Stack to Data Memory



POPM: Pop Top of Stack to Memory-Mapped Register

PSHD: Push Data-Memory Value onto Stack

PSHM: Push Memory-Mapped Register onto Stack

4.1.9. Miscellaneous Program-Control Instructions

SSBX: Set Status Register Bit

RSBX: Reset Status Register Bit



NOP: No Operation

RESET: Software Reset



4.3. On-chip peripherals:

On-chip peripherals facilitate interfacing with external devices. The peripherals are:

• General-purpose I/O pins
• A software-programmable wait-state generator
• Hardware timer
• Host port interface (HPI)
• Clock generator
• Serial port

4.3.1 General-purpose I/O pins:

The device has two general-purpose I/O pins:

• BIO: input pin, used to monitor the status of external devices.
• XF: output pin, software controlled, used to signal external devices.

4.3.2. Software-programmable wait-state generator:

• Extends external bus cycles by up to seven machine cycles.

4.3.3. Hardware Timer

The hardware timer consists of 3 memory-mapped registers:
• The timer register (TIM)
• The timer period register (PRD)
• The timer control register (TCR)
Other elements of the timer are:
• Prescaler block (PSC)
• TDDR (timer divide-down ratio)
• TINT and TOUT

The timer register (TIM) is a 16-bit memory-mapped register that decrements at every pulse from the
prescaler block (PSC).
The timer period register (PRD) is a 16-bit memory-mapped register whose contents are loaded onto
the TIM whenever the TIM decrements to zero or the device is reset (SRESET).
The timer can also be independently reset using the TRB signal. The timer control register
(TCR) is a 16-bit memory-mapped register that contains status and control bits. Table shows the
functions of the various bits in the TCR.
The prescaler block is also an on-chip counter. Whenever the prescaler bits count down to 0, a
clock pulse is given to the TIM register that decrements the TIM register by 1. The TDDR bits contain
the divide-down ratio, which is loaded onto the prescaler block after each time the prescaler bits count
down to 0.
That is to say, the 4-bit value of TDDR determines the divide-by ratio of the timer clock with respect to the system clock. In other words, the TIM decrements either at the rate of the system clock or at a slower rate, as decided by the value of the TDDR bits. TOUT and TINT are the output signals generated as the TIM register decrements to 0. TOUT can trigger the start-of-conversion signal in an ADC interfaced to the DSP.

The sampling frequency of the ADC determines how frequently it receives the TOUT signal.
TINT is used to generate interrupts, which are required to service a peripheral such as a DRAM
controller periodically. The timer can also be stopped, restarted, reset, or disabled by specific status
bits.
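The interaction between the prescaler and the TIM register can be explored with a simplified software model. The sketch below assumes that the prescaler is reloaded from TDDR and the TIM from PRD one count after each reaches zero, which gives an interrupt period of (TDDR + 1) x (PRD + 1) clock cycles; the exact reload timing on a real device should be taken from the TI peripheral reference, and both function names are hypothetical.

```python
def tint_period_cycles(tddr: int, prd: int) -> int:
    """Clock cycles between timer interrupts in this simplified model."""
    return (tddr + 1) * (prd + 1)


def simulate_timer(tddr: int, prd: int, clocks: int) -> list:
    """Return the clock cycles (1-based) at which TINT/TOUT would occur."""
    psc, tim = tddr, prd
    events = []
    for cycle in range(1, clocks + 1):
        if psc > 0:
            psc -= 1              # prescaler still counting down
            continue
        psc = tddr                # prescaler reached 0: reload from TDDR
        if tim > 0:
            tim -= 1              # one pulse to TIM
        else:
            tim = prd             # TIM reached 0: reload from PRD
            events.append(cycle)  # TINT and TOUT are generated
    return events


print(tint_period_cycles(tddr=3, prd=9))       # 40
print(simulate_timer(tddr=3, prd=9, clocks=100))  # [40, 80]
```

To sample an ADC at a given rate, PRD and TDDR would be chosen so that the system clock frequency divided by this period equals the desired sampling frequency.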

4.3.4. Host port interface (HPI):

• Allows the DSP to interface to 8-bit or 16-bit host devices or a host processor.
• Signals in the HPI are:
• Host interrupt (HINT)
• HRDY
• HCNTL0 and HCNTL1
• HBIL
• HR/W

Important signals in the HPI are as follows:

• The 16-bit data bus and the 18-bit address bus.
• The host interrupt, HINT, for the DSP to signal the host when its attention is required.
• HRDY, a DSP output indicating that the DSP is ready for a transfer.
• HCNTL0 and HCNTL1, control signals that indicate the type of transfer to carry out. The transfer types are data, address, etc.
• HBIL. If this is low, it indicates that the current byte is the first byte; if it is high, it indicates that it is the second byte.
• HR/W, which indicates whether the host is carrying out a read operation or a write operation.

4.3.5. Clock Generator:


The clock generator on TMS320C54xx devices has two options: an external clock and an internal clock. In the case of the external clock option, a clock source is directly connected to the device. The internal clock option, on the other hand, uses an internal clock generator and a phase-locked loop (PLL) circuit. The PLL, in turn, can be hardware configured or software programmed. Not all devices of the TMS320C54xx family have all these clock options; they vary from device to device.
4.3.6. Serial I/O Ports:
Three types of serial ports are available:
• Synchronous ports.
• Buffered ports.
• Time-division multiplexed ports.

The synchronous serial ports are high-speed, full-duplex ports that provide direct communication with serial devices such as codecs and analog-to-digital (A/D) converters. A buffered serial port (BSP) is a synchronous serial port that is provided with an auto-buffering unit and is clocked at the full clock rate. The auto-buffering unit reduces the overhead of servicing interrupts. A time-division multiplexed (TDM) serial port is a synchronous serial port that is provided to allow time-division multiplexing of the data. The functioning of each of these on-chip peripherals is controlled by memory-mapped registers assigned to the respective peripheral.

4.4. Interrupts of TMS320C54xx Processors:


Often, when the CPU is in the midst of executing a program, a peripheral device may require a service from the CPU. In such a situation, the main program may be interrupted by a signal generated by the peripheral device. This results in the processor suspending the main program in order to execute another program, called an interrupt service routine, to service the peripheral device. On completion of the interrupt service routine, the processor returns to the main program and continues from where it left off.
An interrupt may be generated either by an internal or an external device. It may also be generated by software. Not all interrupts are serviced when they occur. Only those interrupts that are called nonmaskable are serviced whenever they occur. Other interrupts, which are called maskable interrupts, are serviced only if they are enabled. There is also a priority scheme to determine which interrupt gets serviced first if more than one interrupt occurs simultaneously.
Almost all the devices of the TMS320C54xx family have 32 interrupts. However, the types and the number under each type vary from device to device. Some of these interrupts are reserved for use by the CPU.
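How the processor locates an interrupt vector can be sketched from two facts stated in these notes: the interrupt vector pointer field of PMST (called IPTR in TI documentation) selects a 128-word program page, and there are 32 interrupts. The sketch assumes the page is divided evenly, giving 128 / 32 = 4 words per vector, and that the pointer field is 9 bits wide; `vector_address` is a hypothetical helper.

```python
WORDS_PER_VECTOR = 128 // 32     # assumption: the 128-word page is split evenly


def vector_address(iptr: int, number: int) -> int:
    """Program address of the vector for interrupt `number` (0 to 31)."""
    if not 0 <= number < 32:
        raise ValueError("interrupt number must be in the range 0 to 31")
    page_base = (iptr & 0x1FF) << 7              # 9-bit pointer selects the page
    return page_base | (number * WORDS_PER_VECTOR)


print(hex(vector_address(0x1FF, 0)))   # 0xff80
print(hex(vector_address(0x1FF, 16)))  # 0xffc0
```

Changing the pointer field relocates the whole vector table in one step, for example from ROM to on-chip RAM after booting.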

4.5. Pipeline operation of TMS320C54xx Processors:


The CPU of ’54xx devices has a six-level-deep instruction pipeline. The six stages of the pipeline are independent of each other. This allows overlapping execution of instructions. During any given cycle, up to six different instructions can be active, each at a different stage of processing. The six levels of the pipeline structure are program prefetch, program fetch, decode, access, read, and execute.
1 During program prefetch, the program address bus, PAB, is loaded with the address of the next instruction to be fetched.
2 In the fetch phase, an instruction word is fetched from the program bus, PB, and loaded into the instruction register, IR. These two phases form the instruction fetch sequence.
3 During the decode stage, the contents of the instruction register, IR, are decoded to determine the type of memory access operation and the control signals required for the data-address generation unit and the CPU.
4 The access phase outputs the read operand's address on the data address bus, DAB. If a second operand is required, the other data address bus, CAB, is also loaded with an appropriate address. Auxiliary registers in indirect addressing mode and the stack pointer (SP) are also updated.
5 In the read phase, the data operand(s), if any, are read from the data buses, DB and CB. This phase completes the two-phase read process and starts the two-phase write process. The data address of the write operand, if any, is loaded into the data write address bus, EAB.
6 The execute phase writes the data using the data write bus, EB, and completes the operand write sequence. The instruction is executed in this phase.
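The overlap of the six stages can be visualised with a short Python model. It is an idealised sketch that assumes one new instruction enters the pipeline every cycle and that no stalls or conflicts occur; the function names are invented for this example.

```python
STAGES = ["prefetch", "fetch", "decode", "access", "read", "execute"]


def stage_of(instruction: int, cycle: int):
    """Stage occupied by a 0-based instruction in a 0-based cycle, or None."""
    position = cycle - instruction   # instruction i enters prefetch in cycle i
    return STAGES[position] if 0 <= position < len(STAGES) else None


def active_instructions(cycle: int, count: int) -> dict:
    """Map each in-flight instruction number to its stage for one cycle."""
    stages = {i: stage_of(i, cycle) for i in range(count)}
    return {i: s for i, s in stages.items() if s is not None}


for cycle in range(8):
    print(cycle, active_instructions(cycle, count=8))
```

From cycle 5 onward the printout shows six instructions in flight at once: while instruction 0 is in its execute phase, instruction 5 is only being prefetched. Once the pipeline is full, one instruction completes per cycle.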

Recommended Questions:

1. Describe the Host Port Interface and explain its signals. (Dec 09/Jan 10, 6 marks)
2. Write an assembly language program for TMS320C54XX processors to compute the sum of three product terms given by the equation y(n)=h(0)x(n)+h(1)x(n-1)+h(2)x(n-2) with usual notations. Find y(n) for signed 16-bit data samples and 16-bit constants. (May/June 2011, 6m)
3. Describe the pipelining operation of TMS320C54XX processors. (Dec.11, 8m)
4. Explain the operation of the serial I/O ports and the hardware timer of the TMS320C54XX on-chip peripherals. (Dec.11, 8m)
5. Explain the different types of interrupts in TMS320C54xx processors. (May/June 2009, 6m)
6. Describe the operation of the following instructions of the TMS320C54xx processor, with an example. Describe the operation of the hardware timer with a neat diagram.
7. By means of a figure, explain the pipeline operation of the following sequence of instructions if the initial values of AR1, AR3, and A are 104, 101, and 2, and the values stored in the memory locations 101, 102, 103, and 104 are 4, 6, 8, and 12. Also provide the values of registers AR3, AR1, T, and A.
8. Describe the operation of the following instructions of TMS320C54XX processors. (July 12, 8m)
9. Explain the following assembler directives of TMS320C54XX processors: (i) .mmregs (ii) .global (iii) .include ‘xx’ (iv) .data (v) .end (vi) .bss (Dec 09/Jan 10, 6 marks)
