Architecture of c5x
TMS320C5X
Introduction
• Leading manufacturers of integrated circuits such as Texas Instruments (TI), Analog Devices and
Motorola manufacture digital signal processor (DSP) chips. These manufacturers have
developed a range of DSP chips of varying complexity.
• The TMS320 DSP family consists of two types of single-chip DSPs: 16-bit fixed-point and 32-bit
floating-point.
• These DSPs possess the operational flexibility of high-speed controllers and the numerical
capability of array processors.
• Combining these two qualities, the TMS320 processors are inexpensive alternatives to custom
fabricated VLSI and multichip bit-slice processors.
• The TMS320C5X belongs to the fifth generation of TI’s TMS320 family of DSPs. The first five
generations of the TMS320 family are C1X, C2X, C3X, C4X and C5X. The C1X, C2X, C2XX and
C5X are 16-bit fixed-point processors.
• The higher-generation fixed-point processors are upward compatible with the instruction sets of the
lower-generation fixed-point processors.
• For example, the C5X can execute the instructions of both the C1X and the C2X. The C54X is upward
compatible with the C5X. The C3X and C4X are 32-bit floating-point processors, and the C4X is upward
compatible with the C3X instruction set.
• The sixth-generation C6X devices feature VelociTI™, an advanced very long instruction word
(VLIW) architecture developed by TI, and can execute up to 1600 MIPS.
• The eighth-generation C8X devices have, on a single piece of silicon, a number of advanced
DSPs (ADSPs) and a RISC master processor.
• Typical applications of the above families of TI DSPs are as follows:
• C1X, C2X, C2XX, C5X, C54X: toys, hard disk drives, modems, cellular phones and active car
suspensions
• C3X: filters, analysers, hi-fi systems, voice mail, imaging, bar-code readers, motor control, 3D
graphics or scientific processing
• C4X: parallel-processing clusters in virtual reality, image recognition, telecom routing and parallel
processing systems
• C6X: wireless base stations, pooled modems, remote-access servers, digital subscriber loop
systems, cable modems and multichannel telephone systems
• C8X: video telephony, 3D computer graphics, virtual reality and a number of multimedia
applications
• The instruction set of the TMS320C5X and other DSP chips is better suited to signal processing than the
instruction set of conventional microprocessors such as the 8085 and Z80, as most of the instructions
require only a single cycle for execution.
• The multiply-accumulate operation, used frequently in signal processing applications such as
convolution, requires only one cycle in a DSP.
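To make the role of the multiply-accumulate step concrete, the following is a minimal Python sketch of direct-form convolution written as repeated MAC steps. The function names `mac` and `convolve` are illustrative only, and the sketch says nothing about cycle counts; it simply shows the operation that a DSP performs once per cycle in the inner loop.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: return acc + a * b."""
    return acc + a * b


def convolve(x, h):
    """Linear convolution of sequences x and h using repeated MAC steps."""
    y = []
    for n in range(len(x) + len(h) - 1):
        acc = 0  # the accumulator is cleared for each output sample
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                acc = mac(acc, h[k], x[n - k])
        y.append(acc)
    return y


result = convolve([1, 2, 3], [1, 1])
print(result)
```

Each output sample costs one MAC per filter tap, which is why a single-cycle MAC dominates the speed of filtering code.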
Architecture of TMS320C5X DSPs
• The TMS320C5X DSPs are said to have an advanced Harvard architecture because they
have separate memory bus structures for program and data and have instructions that
enable data transfer between the program and data memory areas.
• Separate program and data buses allow simultaneous access to program instructions and data,
providing a high degree of parallelism.
• The C5X architecture has four buses and their functions are as follows:
• Program bus (PB)--It carries the instruction code and immediate operands from program memory
space to the CPU.
• Program address bus (PAB)--It provides addresses to program memory space for both reads and
writes.
• Data read bus (DB)--It interconnects various elements of the CPU to data memory space.
• Data read address bus (DAB)--It provides the address to access the data memory space.
• The program and data buses can work together to transfer data from on-chip data memory and internal
or external program memory to the multiplier for single-cycle multiply/accumulate operations.
Internal Architecture of C5X
• Some of the registers/execution units in the CPU of C5X DSP processors and their functions are as follows.
CENTRAL ARITHMETIC LOGIC UNIT (CALU)
• It consists of the following elements: (16 x 16)-bit parallel multiplier, arithmetic logic unit (ALU), accumulator
(ACC), accumulator buffer (ACCB) and product register (PREG), each with 32 bits, and a 0-16-bit left barrel
shifter and right barrel shifter.
• One of the operands for the ALU operation comes from the ACC. The results of operations performed in the
central ALU are stored in the ACC. Either the higher-order word or the lower-order word of the ACC can be
loaded from memory. A 32-bit register denoted as ACCB is used for temporary storage of the ACC.
• The hardware multiplier unit in the C5X processors performs 16 x 16 multiplication of numbers represented
in 2’s complement form.
• The 32-bit PREG holds the result of multiplication. The 16-bit temporary register 0 (TREG0) holds the
multiplicand.
• The other operand for the multiplication can be specified using one of the addressing modes.
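The multiplier behaviour described above can be sketched in Python as follows. This is a behavioural model under stated assumptions, not a description of the hardware: the helper names `to_signed` and `multiply_16x16` are hypothetical, and registers are represented as plain integers holding bit patterns.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def multiply_16x16(treg0, operand):
    """Return the 32-bit PREG bit pattern for a signed 16 x 16 multiply.

    treg0   -- 16-bit multiplicand held in TREG0
    operand -- 16-bit value fetched through one of the addressing modes
    """
    product = to_signed(treg0, 16) * to_signed(operand, 16)
    return product & 0xFFFFFFFF  # keep the 32-bit PREG pattern


preg = multiply_16x16(0xFFFF, 0x0002)  # -1 times 2
print(hex(preg), to_signed(preg, 32))
```

A 32-bit product register is wide enough for any product of two 16-bit 2's complement numbers, including the extreme case of -32768 times -32768.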
• The 0-16-bit left barrel shifter and right barrel shifter in the CALU permit the contents of memory to be left
shifted by 0 to 16 bits before they are either fed to the ALU or stored from the ALU to memory.
• The CPU registers ACC and PREG can also be shifted using these shifters. In this case they require two
cycles.
• A 5-bit register TREG1 specifies the number of bits by which the scaling shifter should shift either the
incoming data to one of the CPU registers or vice versa. When the incoming data to the CPU is left shifted by
the scaling shifter, the LSBs are filled with 0.
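The left shift with zero-filled LSBs can be sketched as below. The function name `scale_left` is hypothetical, the shift count is passed as an argument rather than read from TREG1, and the handling of the upper bits follows the SXM bit described under the status registers (sign-extended when SXM is set, zero-filled otherwise); treat this as an illustrative model only.

```python
def scale_left(data16, shift, sxm=True):
    """Left shift a 16-bit word by 0..16 bits into a 32-bit result.

    The vacated LSBs are filled with 0. The upper bits are sign-extended
    when sxm is True and zero-filled when sxm is False.
    """
    if not 0 <= shift <= 16:
        raise ValueError("shift must be in the range 0..16")
    value = data16 & 0xFFFF
    if sxm and value & 0x8000:
        value -= 0x10000  # reinterpret as a negative 2's complement value
    return (value << shift) & 0xFFFFFFFF


print(hex(scale_left(0xFFFF, 4, sxm=True)))
print(hex(scale_left(0xFFFF, 4, sxm=False)))
```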
AUXILIARY REGISTER ALU (ARAU)
• It consists of eight 16-bit auxiliary registers (ARs) AR0-AR7, a 3-bit auxiliary register pointer (ARP) and
an unsigned 16-bit ALU. ARAU calculates indirect addresses by using inputs from ARs, 16-bit index
register (INDX) and auxiliary register compare register (ARCR).
• The ARAU can auto index the current AR while the data memory location is being addressed and can
index either by ± 1 or by the contents of the INDX.
• As a result, accessing data does not require the CALU for address manipulation; therefore, the CALU is
free for other operations in parallel.
• This makes instruction execution faster than on conventional microprocessors.
• For example, let us consider the following sequence of 8085 instructions:
• MOV A,M
• INX H
• These instructions load the accumulator using indirect addressing mode and then increment the HL register
pair, which is used as the address pointer. These two instructions can be replaced by a single 5X
instruction LACC *+, 0.
• Further, any one of the auxiliary registers can be used as the address pointer and incremented by the above
instruction. The register that will be used is specified by the content of the ARP.
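The effect of LACC *+, 0 can be modelled with a few lines of Python. The class `ToyC5x` and its method name are hypothetical, and the model is deliberately simplified: it ignores sign extension and the shift operand (which is 0 in this example) and only shows the load through the AR selected by the ARP followed by the post-increment.

```python
class ToyC5x:
    """Very small model of the ARs, the ARP and the ACC."""

    def __init__(self, memory):
        self.memory = memory  # data memory: address -> 16-bit word
        self.ar = [0] * 8     # auxiliary registers AR0-AR7
        self.arp = 0          # 3-bit auxiliary register pointer
        self.acc = 0          # accumulator

    def lacc_postinc(self):
        """Model of LACC *+, 0: load ACC through the current AR, then AR += 1."""
        address = self.ar[self.arp]
        self.acc = self.memory[address]
        self.ar[self.arp] = (address + 1) & 0xFFFF


cpu = ToyC5x({0x0300: 7, 0x0301: 9})
cpu.arp = 2            # select AR2 as the address pointer
cpu.ar[2] = 0x0300
cpu.lacc_postinc()
print(cpu.acc, hex(cpu.ar[2]))
```

Calling `lacc_postinc` again would load the next word, which is how a table can be walked with one instruction per element.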
• The auxiliary registers AR0-AR7 may also be used as general-purpose registers for holding the operands for
arithmetic and logical operations in the CALU. Some of the other registers of the ARAU and their functions are
as follows:
• INDEX REGISTER (INDX)
• The 16-bit INDX is used by the ARAU as a step value (addition or subtraction by more than 1) to modify the
address in the ARs during indirect addressing. For example, when the ARAU steps across a row of a matrix, the
indirect address is incremented by 1.
• However, when the ARAU steps down a column, the address is incremented by the dimension of the matrix.
• The ARAU can add or subtract the value stored in the INDX from the current AR as part of the indirect address
operation.
• INDX can also map the dimension of the address block used for bit-reversal addressing.
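The row and column stepping described for INDX can be sketched as follows. The sketch assumes a matrix stored row by row starting at a base address, with INDX loaded with the number of columns; the function name `column_addresses` and the example addresses are illustrative only.

```python
def column_addresses(base, rows, cols, column):
    """Addresses visited when stepping down one column of a row-major matrix.

    The AR starts at the first element of the column and INDX (here equal
    to the number of columns) is added after every access.
    """
    indx = cols
    ar = (base + column) & 0xFFFF
    addresses = []
    for _ in range(rows):
        addresses.append(ar)
        ar = (ar + indx) & 0xFFFF  # step to the same column in the next row
    return addresses


print([hex(a) for a in column_addresses(0x0100, rows=3, cols=4, column=1)])
```

Stepping across a row needs only the increment by 1, so the same pointer can serve both directions by choosing between the unit step and the INDX step.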
• MEMORY-MAPPED REGISTERS
• The ‘C5X has 96 registers mapped into page 0 of the data memory space. All ‘C5X DSPs have 28 CPU
registers and 16 input/output (I/O) port registers but have different numbers of peripheral and reserved
registers.
• Since the memory-mapped registers are a component of the data memory space, they can be written to
and read from in the same way as any other data memory location.
• The memory-mapped registers are used for indirect data address pointers, temporary storage, CPU
status and control, or integer arithmetic processing through the ARAU.
• PROGRAM CONTROLLER
• The program controller contains logic circuitry that decodes the instructions, manages the CPU
pipeline, stores the status of CPU operations and decodes the conditional operations.
• Parallelism of the architecture lets the ‘C5X perform three concurrent memory operations in any given
machine cycle: fetch an instruction, read an operand and write an operand.
• The program controller consists of the following elements:
• 16-bit program counter (PC)
• 16-bit status registers ST0, ST1, processor mode status register (PMST) and circular buffer control
register (CBCR)
• (8 x 16)-bit hardware stack
• Address generation logic
• Instruction register
• Interrupt flag register and interrupt mask register
SOME FLAGS IN THE STATUS REGISTERS
• The status registers can be stored into data memory and loaded from data memory, thereby allowing the ‘C5X
status to be saved and restored for subroutines. The ST0 and ST1 each have an associated 1-level deep shadow
register stack for automatic context-saving when an interrupt trap is taken.
• These registers are automatically restored upon a return from interrupt.
• The bit assignment details for ST0 and ST1 are given in the figure. The significance of the various bits of ST0
and ST1 is as follows:
• ARP (Auxiliary Register Pointer) bits--These bits select the AR to be used in indirect addressing. When the ARP
is loaded, the previous ARP value is copied to the auxiliary register buffer (ARB) in ST1.
• OV (Overflow) flag bit--This bit indicates an arithmetic operation overflow in the ALU.
• OVM (Overflow Mode) bit--This bit enables/disables the accumulator overflow saturation mode in the ALU.
• INTM (Interrupt Mode) bit--This bit globally masks or enables all interrupts. The INTM bit has no effect on the
non-maskable RS and NMI interrupts.
• DP (Data Memory Page Pointer) bits--These bits specify the address of the current data memory page. The 9 DP
bits are concatenated with the 7 LSBs of an instruction word to form a direct memory address of 16 bits.
• ARB Auxiliary Register Buffer
• This 3-bit field holds the previous value contained in the ARP in ST0. Whenever the
ARP is loaded, the previous ARP value is copied to the ARB, except when using the
LST #0 instruction.
• When the ARB is loaded using the LST #1 instruction, the same value is also copied
to the ARP.
• This is useful when restoring context (when not using the automatic context save) in
a subroutine that modifies the current ARP.
• CNF On-chip RAM configuration control bit
• This 1-bit field selects whether the on-chip dual-access RAM block 0 (DARAM B0) is
addressable in data memory space or in program memory space.
• TC Test/control flag bit
• This 1-bit flag stores the results of the ALU or parallel logic unit (PLU) test bit operations.
• The status of the TC bit determines if the conditional branch, call and return instructions are
to be executed.
• SXM Sign-extension mode bit
• This 1-bit field enables/disables sign extension of an arithmetic operation.
• The SXM bit does not affect the operations of certain arithmetic or logical
instructions; the ADDC, ADDS, SUBB or SUBS instruction suppresses sign
extension, regardless of SXM.
• C Carry bit--This 1-bit field indicates an arithmetic operation carry or borrow in the
ALU. The single-bit shift and rotate instructions affect the C bit.
• HM Hold mode bit--This 1-bit field determines whether the central processing unit
(CPU) stops or continues execution when acknowledging an active HOLD signal.
• XF pin status bit--This 1-bit field determines the level of the external flag (XF)
output pin.
• PM Product shift mode bits--This 2-bit field determines the product shifter (P-SCALER)
mode and shift value for the PREG output into the ALU.
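Two of the status fields above lend themselves to a short worked sketch: the DP bits, which form the upper part of a direct address, and the OVM bit, which selects saturation on overflow. The Python below is an illustrative model with hypothetical function names. The 9-bit DP width follows from a 16-bit address minus the 7 LSBs taken from the instruction word, and the saturation values are assumed to be the largest positive and most negative 32-bit 2's complement numbers.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def direct_address(dp, instruction_word):
    """Concatenate the 9 DP bits with the 7 LSBs of the instruction word."""
    return ((dp & 0x1FF) << 7) | (instruction_word & 0x7F)


def add_with_ovm(acc, operand, ovm):
    """Add two signed 32-bit values and return (new_acc, ov).

    With ovm set, an overflowing result saturates. With ovm cleared, the
    result wraps around as ordinary 2's complement arithmetic.
    """
    total = acc + operand
    ov = not (INT32_MIN <= total <= INT32_MAX)
    if ov and ovm:
        total = INT32_MAX if total > 0 else INT32_MIN
    elif ov:
        total = (total + 2**31) % 2**32 - 2**31  # wrap into the 32-bit range
    return total, ov


print(hex(direct_address(6, 0x2005)))
print(add_with_ovm(INT32_MAX, 1, ovm=True))
print(add_with_ovm(INT32_MAX, 1, ovm=False))
```

Saturation is usually preferred in filtering code because a clipped sample is far less audible or visible than a sign flip caused by wraparound.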
ON-CHIP MEMORY
• The C5X architecture contains a considerable amount of on-chip memory to aid in system performance
and integration:
• Program Read-Only Memory (ROM)
• Data/Program Dual-Access RAM (DARAM)
• Data/Program Single-Access RAM (SARAM)
• Program ROM--All ‘C5X DSPs carry a 16-bit on-chip maskable programmable ROM. This memory
is used for booting program code from slower external ROM or EPROM to fast on-chip or external
RAM.
• Data/Program Dual-Access RAM--All C5X DSPs carry a 1056-word x 16-bit on-chip dual-access
RAM (DARAM).
• The DARAM is divided into three individually selectable memory blocks: 512-word data or program
DARAM block B0, 512-word data DARAM block B1 and 32-word data DARAM block B2.
• The DARAM is primarily intended to store data values but, when needed, can be used to store
programs as well.
• DARAM blocks B1 and B2 are always configured as data memory; however, DARAM block B0 can
be configured by software as data or program memory.
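The DARAM organisation described above can be summarised in a small Python sketch. The block sizes come from the text; the function name `daram_spaces` is hypothetical, and the polarity of the CNF bit (CNF = 1 placing B0 in program space) is an assumption of this sketch rather than something stated here.

```python
# Block sizes in 16-bit words, as listed above.
DARAM_BLOCKS = {"B0": 512, "B1": 512, "B2": 32}


def daram_spaces(cnf):
    """Return the memory space occupied by each DARAM block.

    Assumption: cnf = 1 maps B0 into program space and cnf = 0 maps it
    into data space. B1 and B2 are always data memory.
    """
    return {
        "B0": "program" if cnf else "data",
        "B1": "data",
        "B2": "data",
    }


total_words = sum(DARAM_BLOCKS.values())
print(total_words)
print(daram_spaces(0))
print(daram_spaces(1))
```

The three block sizes add up to the 1056 words quoted for the DARAM.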
• Data/Program Single-Access RAM
• Almost all C5X DSPs carry a 16-bit on-chip single-access RAM (SARAM), ranging
in size from 1K to 9K 16-bit words.
• Code can be booted from an off-chip ROM and then executed at full speed once it
is loaded into the on-chip SARAM.
• The SARAM can be configured by software as data memory, as program memory
or combination of both data memory and program memory.
• On-Chip Memory Protection
• The C5X DSPs have a maskable option that protects the contents of on-chip
memories.
• When the related bit is set, no externally originating instruction can access the
on-chip memory spaces.
ON-CHIP PERIPHERALS
• All C5X DSPs have the same CPU structure; however, they have different on-chip
peripherals connected to their CPUs.
• The on-chip peripherals available on ‘C5X DSPs are as follows:
• Clock Generator
• Hardware Timer
• Software-Programmable Wait-State Generators
• Parallel I/O Ports
• Host Port Interface (HPI)
• Serial Port
• Buffered Serial Port (BSP)
• Time-Division Multiplexed (TDM) Serial Port
• User-Maskable Interrupts