C28x Workshop
Important Notice
Texas Instruments and its subsidiaries (TI) reserve the right to make changes to their products or to discontinue any product or service without notice, and advise customers to obtain the latest version of relevant information to verify, before placing orders, that information being relied on is current and complete. All products are sold subject to the terms and conditions of sale supplied at the time of order acknowledgment, including those pertaining to warranty, patent infringement, and limitation of liability. TI warrants performance of its semiconductor products to the specifications applicable at the time of sale in accordance with TI's standard warranty. Testing and other quality control techniques are utilized to the extent TI deems necessary to support this warranty. Specific testing of all parameters of each device is not necessarily performed, except those mandated by government requirements. Customers are responsible for their applications using TI components. In order to minimize risks associated with the customer's applications, adequate design and operating safeguards must be provided by the customer to minimize inherent or procedural hazards. TI assumes no liability for applications assistance or customer product design. TI does not warrant or represent that any license, either express or implied, is granted under any patent right, copyright, mask work right, or other intellectual property right of TI covering or relating to any combination, machine, or process in which such semiconductor products or services might be or are used. TI's publication of information regarding any third party's products or services does not constitute TI's approval, warranty or endorsement thereof.
Revision History
October 2001    Revision 1.0
January 2002    Revision 2.0
May 2002        Revision 3.0
June 2002       Revision 3.1
October 2002    Revision 4.0
December 2002   Revision 4.1
July 2003       Revision 4.2
August 2003     Revision 4.21
February 2004   Revision 5.0
May 2004        Revision 5.1
January 2005    Revision 5.2
June 2005       Revision 6.0
September 2005  Revision 6.1
October 2005    Revision 6.2
May 2006        Revision 6.21
February 2007   Revision 6.22
July 2008       Revision 7.0
October 2008    Revision 7.1
February 2009   Revision 7.2
Mailing Address
Texas Instruments
Training Technical Organization
7839 Churchill Way
M/S 3984
Dallas, Texas 75251-1903
Introductions
- Name
- Company
- Project Responsibilities
- DSP / Microcontroller Experience
- TMS320 DSP Experience
- Hardware / Software - Assembly / C
- Interests
Architecture Overview
Introduction
This architecture overview introduces the basic architecture of the TMS320C28x (C28x) series of microcontrollers from Texas Instruments. The C28x series adds a new level of general purpose processing ability unseen in any previous DSP chips. The C28x is ideal for applications combining digital signal processing, microcontroller processing, efficient C code execution, and operating system tasks. Unless otherwise noted, the terms C28x and C2833x refer to TMS320F2833x (with FPU) and TMS320F2823x (without FPU) devices throughout the remainder of these notes. For specific details and differences, please refer to the device data sheet and user's guide.
Learning Objectives
When this module is complete, you should have a basic understanding of the C28x architecture and how all of its components work together to create a high-end, uniprocessor control system.
- Review the C28x block diagram and device features
- Describe the C28x bus structure and memory map
- Identify the various memory blocks on the C28x
- Identify the peripherals available on the C28x
Module Topics
Architecture Overview
Module Topics
What is the TMS320C28x?
    TMS320C28x Internal Bussing
C28x CPU
    Special Instructions
    Pipeline Advantage
    FPU Pipeline
Memory
    Memory Map
    Code Security Module (CSM)
    Peripherals
Fast Interrupt Response
C28x Mode
Reset
Summary
[Figure: TMS320C28x block diagram: CPU (32x32 bit multiplier, 32-bit auxiliary registers, R-M-W atomic ALU, FPU, register bus, real-time JTAG emulation), RAM, Boot ROM, 6-channel DMA with DMA bus, XINTF]
The C28x architecture can be divided into 3 functional blocks:
- CPU and busing
- Memory
- Peripherals

[Figure: TMS320C28x internal bussing: register bus / result bus, data/program-write data bus (32), data-write address bus (32), peripherals, external interface]
The 32-bit-wide data busses enable single-cycle 32-bit operations. This multiple-bus architecture, known as a Harvard bus architecture, enables the C28x to fetch an instruction, read a data value, and write a data value in a single cycle. All peripherals and memories are attached to the memory bus and will prioritize memory accesses.
C28x CPU
The C28x is a highly integrated, high performance solution for demanding control applications. The C28x is a cross between a general purpose microcontroller and a digital signal processor, balancing the code density of a RISC processor and the execution speed of a DSP with the architecture, firmware, and development tools of a microcontroller. The DSP features include a modified Harvard architecture and circular addressing. The RISC features are single-cycle instruction execution, register-to-register operations, and a modified Harvard architecture. The microcontroller features include ease of use through an intuitive instruction set, byte packing and unpacking, and bit manipulation.
C28x + FPU features:
- 32-bit fixed-point CPU + FPU
- 32x32 fixed-point MAC, doubles as dual 16x16 MAC
- IEEE single-precision floating-point hardware and MAC
- Floating-point simplifies software development and boosts performance
- Fast interrupt service time
- Single cycle read-modify-write instructions
- Unique real-time debugging capabilities
The C28x design supports an efficient C engine with hardware that allows the C compiler to generate compact code. Multiple busses and an internal register bus allow an efficient and flexible way to operate on the data. The architecture is also supported by powerful addressing modes, which allow the compiler as well as the assembly programmer to generate compact code that corresponds almost one-to-one to the C code. The C28x is as efficient in DSP math tasks as it is in system control tasks. This efficiency removes the need for a second processor in many systems. The 32 x 32-bit MAC capabilities of the C28x and its 64-bit processing capabilities enable the C28x to efficiently handle higher numerical resolution problems that would otherwise demand a more expensive solution. Along with this is the capability to perform two 16 x 16-bit multiply accumulate instructions simultaneously, or Dual MACs (DMAC). Also, some devices feature a floating-point unit. The C28x is source code compatible with the 24x/240x devices, and previously written code can be reassembled to run on a C28x device, allowing for migration of existing code onto the C28x.
Special Instructions
Atomic instruction benefits:
- Simpler programming
- Smaller, faster code
- Uninterruptible (Atomic)
- More efficient compiler

Standard Load/Store (6 words / 6 cycles):

    DINT
    MOV   AL,*XAR2
    AND   AL,#1234h
    MOV   *XAR2,AL
    EINT

Atomic Read/Modify/Write (2 words / 1 cycle):

    AND   *XAR2,#1234h
Atomics are small, common instructions that are non-interruptible. The atomic ALU capability supports instructions and code that manage tasks and processes. These instructions usually execute several cycles faster than traditional coding.
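The difference between the two coding styles can be sketched in C, the language used in the labs. This is a host-side illustration only, not TI code: `irq_disable()` and `irq_enable()` are hypothetical stand-ins for DINT and EINT, the function names are invented for this example, and whether a given compiler emits a single atomic AND for the second form is an assumption that depends on the toolchain and its options.

```c
#include <stdint.h>

/* Hypothetical stand-ins for DINT / EINT; they do nothing on a host PC. */
static void irq_disable(void) { }
static void irq_enable(void)  { }

/* Standard load/store style: the update takes several steps, so it is
   bracketed by interrupt disable/enable to keep it uninterruptible. */
void and_mask_load_store(volatile uint16_t *loc, uint16_t mask)
{
    uint16_t tmp;
    irq_disable();
    tmp = *loc;                      /* MOV AL,*XAR2 */
    tmp = (uint16_t)(tmp & mask);    /* AND AL,#mask */
    *loc = tmp;                      /* MOV *XAR2,AL */
    irq_enable();
}

/* Read-modify-write style: one statement, which an atomic instruction such
   as AND *XAR2,#mask could implement directly (assumption about codegen). */
void and_mask_rmw(volatile uint16_t *loc, uint16_t mask)
{
    *loc = (uint16_t)(*loc & mask);
}
```

Both functions leave the same value in memory; the point of the atomic form is that it needs no interrupt bracketing and fewer words and cycles.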
Pipeline Advantage
[Figure: C28x pipeline: instructions A through H overlapped across the eight pipeline stages]

Pipeline stages (8-stage pipeline):
- F1: Instruction Address
- F2: Instruction Content
- D1: Decode Instruction
- D2: Resolve Operand Addr
- R1: Operand Address
- R2: Get Operand
- E: CPU doing "real" work
- W: Store content to memory

Protected Pipeline:
- Order of results is as written in source code
- Programmer need not worry about the pipeline
The C28x uses a special 8-stage protected pipeline to maximize the throughput. This protected pipeline prevents a write to and a read from the same location from occurring out of order. This pipelining also enables the C28x to execute at high speeds without resorting to expensive high-speed memories. Special branch-look-ahead hardware minimizes the latency for conditional discontinuities. Special store conditional operations further improve performance.
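To make the throughput benefit concrete, the following sketch compares an ideal pipelined cycle count with a fully serialized one. It assumes one instruction enters the pipeline per cycle and that no stalls occur, which real code will not always achieve; the function names are invented for this illustration.

```c
/* Ideal cycle count for n instructions in a pipeline with `stages` stages,
   assuming one instruction enters per cycle and no stalls occur. */
unsigned long pipeline_cycles(unsigned long n, unsigned stages)
{
    if (n == 0 || stages == 0) {
        return 0;
    }
    return n + (stages - 1);   /* fill the pipe once, then one result per cycle */
}

/* Cycle count if each instruction had to finish before the next one started. */
unsigned long unpipelined_cycles(unsigned long n, unsigned stages)
{
    return n * stages;
}
```

Under these assumptions, eight instructions (A through H) complete in 15 cycles on the 8-stage pipeline instead of 64.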
FPU Pipeline
[Figure: FPU pipeline: C28x pipeline stages F1, F2, D1, D2, R1, R2, E, W shown with the FPU instruction stages D, R, E1, E2/W]

- Floating-point math operations and conversions between integer and floating-point formats require 1 delay slot; everything else does not require a delay slot (load, store, max, min, absolute, negative, etc.)
- Floating Point Unit has an unprotected pipeline, i.e. the FPU can issue an instruction before the previous instruction has written its results
- Compiler prevents pipeline conflicts
- Assembler detects pipeline conflicts
- Performance improvement by placing non-conflicting instructions in floating-point pipeline delay slots
Floating-point operations are not pipeline protected. Some instructions require delay slots for the operation to complete. This can be accomplished by inserting NOPs or other non-conflicting instructions between operations. In the user's guide, instructions requiring delay slots have a "p" after their cycle count. The 2p stands for 2 pipelined cycles: a new instruction can be started on each cycle, but the result is valid only 2 instructions later. Three general guidelines for the FPU pipeline are:

Type               Instructions                                                  Cycles         Delay slots
Math               MPYF32, ADDF32, SUBF32, MACF32                                2p cycles      One delay slot
Conversion         I16TOF32, F32TOI16, F32TOI16R, etc.                           2p cycles      One delay slot
Everything else*   Load, Store, Compare, Min, Max, Absolute and Negative value   Single cycle   No delay slot
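The guideline table can be expressed as a small helper that reports how many NOPs are still needed once some non-conflicting instructions have been placed after an FPU instruction. This is a sketch of the rule of thumb only: the enum and function names are invented here, and the per-instruction cycle counts in the user's guide remain the authority.

```c
/* Instruction classes from the guideline table (names are illustrative). */
typedef enum { FPU_MATH, FPU_CONVERSION, FPU_OTHER } fpu_class_t;

/* Delay slots per instruction class, following the guideline table. */
unsigned fpu_delay_slots(fpu_class_t c)
{
    return (c == FPU_MATH || c == FPU_CONVERSION) ? 1u : 0u;
}

/* NOPs still needed after `fillers` non-conflicting instructions have been
   placed between the FPU instruction and the first use of its result. */
unsigned fpu_nops_needed(fpu_class_t c, unsigned fillers)
{
    unsigned slots = fpu_delay_slots(c);
    return (fillers >= slots) ? 0u : slots - fillers;
}
```

For example, a math instruction followed immediately by a use of its result needs one NOP, while one independent instruction in between removes the need for it.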
Memory
The memory space on the C28x is divided into program memory and data memory. There are several different types of memory available that can be used as both program memory and data memory. They include the flash memory, single access RAM (SARAM), OTP, off-chip memory, and Boot ROM which is factory programmed with boot software routines or standard tables used in math related algorithms.
Memory Map
The C28x CPU contains no memory, but can access memory both on and off the chip. The C28x uses 32-bit data addresses and 22-bit program addresses. This allows for a total address reach of 4G words (1 word = 16-bits) in data memory and 4M words in program memory. Memory blocks on all C28x designs are uniformly mapped to both program and data space. This memory map shows the different blocks of memory available to the program and data space.
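The address-reach figures follow directly from the address widths. A minimal sketch of the arithmetic (the function name is invented for this illustration):

```c
#include <stdint.h>

/* Number of 16-bit words reachable with an address that is `bits` wide.
   Valid for bits < 64. */
uint64_t address_reach_words(unsigned bits)
{
    return (uint64_t)1 << bits;
}
```

A 32-bit data address gives 2^32 = 4G words, and a 22-bit program address gives 2^22 = 4M words.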
[Figure: C28x memory map, data and program space]

Start address   Block
0x000E00        PF 0 (6Kw)
0x002000        reserved
0x004000        XINTF Zone 0 (4Kw)
0x005000        PF 3 (4Kw)
0x006000        PF 1 (4Kw)
0x007000        PF 2 (4Kw)
0x008000        L0 SARAM (4Kw)
0x009000        L1 SARAM (4Kw)
0x00A000        L2 SARAM (4Kw)
0x00B000        L3 SARAM (4Kw)
0x00C000        L4 SARAM (4Kw)
0x00D000        L5 SARAM (4Kw)
0x00E000        L6 SARAM (4Kw)
0x00F000        L7 SARAM (4Kw)
0x010000        reserved
0x100000        XINTF Zone 6
0x200000        XINTF Zone 7
0x300000        FLASH
0x33FFF8        Passwords
0x340000        reserved
0x380080        ADC calibration data
0x380090        reserved
0x380400        User OTP (1Kw)
0x380800        reserved
0x3F8000        L0 SARAM (4Kw)
0x3F9000        L1 SARAM (4Kw)
0x3FA000        L2 SARAM (4Kw)
0x3FB000        L3 SARAM (4Kw)
0x3FC000        reserved
0x3FE000        Boot ROM (8Kw)
0x3FFFC0        BROM Vectors (64w)
0x3FFFFF        (end of map)

Dual Mapped: L0, L1, L2, L3
CSM Protected: L0, L1, L2, L3, OTP, FLASH, ADC CAL, Flash Regs in PF0
DMA Accessible: L4, L5, L6, L7, XINTF Zone 0, 6, 7
[Figure: Code Security Module (CSM): OTP (1Kw) and the dual-mapped L0, L1, L2, L3 SARAM blocks (4Kw each)]
- 128-bit user defined password is stored in Flash
- 128 bits = 2^128 = 3.4 x 10^38 possible passwords
- To try 1 password every 8 cycles at 150 MHz, it would take at least 5.8 x 10^23 years to try all possible combinations!
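The password-space estimate can be checked with a few lines of arithmetic. This sketch assumes a 365-day year and uses invented function names; it reproduces the order of magnitude quoted above rather than an exact figure.

```c
/* 2^bits as a double (exact for powers of two in this range). */
double password_count(unsigned bits)
{
    double n = 1.0;
    unsigned i;
    for (i = 0; i < bits; i++) {
        n *= 2.0;
    }
    return n;
}

/* Years needed to try every password at the given rate. */
double exhaustive_search_years(unsigned bits, double cycles_per_try, double clock_hz)
{
    const double seconds_per_year = 365.0 * 24.0 * 3600.0;  /* assumption: 365-day year */
    double seconds = password_count(bits) * cycles_per_try / clock_hz;
    return seconds / seconds_per_year;
}
```

With 128 bits, 8 cycles per attempt, and a 150 MHz clock, the result is on the order of 10^23 years.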
Peripherals
The C28x comes with many built-in peripherals optimized to support control applications. These peripherals vary depending on which C28x device you choose.
- ePWM
- eCAP
- eQEP
- Analog-to-Digital Converter
- Watchdog Timer
- McBSP
- SPI
- SCI
- I2C
- CAN
- GPIO
- DMA
[Figure: Fast interrupt response: the PIE module, for 96 interrupts, feeds 12 interrupts (INT1 to INT12) through IFR, IER, and INTM to the 28x CPU; the PIE register map holds the 96 entries]
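The interrupt path in the figure can be modelled in a few lines of C. This is a simplified sketch with invented helper names, not the device header files: it assumes the 96 PIE interrupts are split evenly into 12 groups of 8 (96 / 12), that bit n-1 of IFR and IER corresponds to CPU line INTn, and that INTM = 1 masks all of these interrupts globally.

```c
#include <stdint.h>

#define PIE_GROUPS     12u   /* CPU lines INT1 to INT12 */
#define PIE_PER_GROUP   8u   /* 96 interrupts / 12 groups */

/* Zero-based index (0..95) of PIE interrupt INTx.y; -1 if out of range. */
int pie_index(unsigned group, unsigned channel)
{
    if (group < 1u || group > PIE_GROUPS) {
        return -1;
    }
    if (channel < 1u || channel > PIE_PER_GROUP) {
        return -1;
    }
    return (int)((group - 1u) * PIE_PER_GROUP + (channel - 1u));
}

/* Simplified core-level gate: CPU line INTx (1..12) is serviced only if its
   flag (IFR) and enable (IER) bits are set and INTM (global mask) is 0. */
int cpu_line_serviced(uint16_t ifr, uint16_t ier, int intm, unsigned line)
{
    uint16_t bit;
    if (line < 1u || line > PIE_GROUPS) {
        return 0;
    }
    bit = (uint16_t)(1u << (line - 1u));
    return (ifr & bit) != 0 && (ier & bit) != 0 && intm == 0;
}
```

The model also shows why nothing is serviced straight after reset: INTM is 1 until software clears it.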
C28x Mode
The C28x is one of several members of the TMS320 family of digital signal controllers and processors. The C28x is source code compatible with the 24x/240x devices, and previously written code can be reassembled to run on a C28x device. This allows for migration of existing code onto the C28x.
Mode                    OBJMODE   AMODE
C28x Native Mode        1         0
C24x Compatible Mode    1         1
Reset default           0         0
Reserved                0         1
- Almost all uses will run in C28x Native Mode
- The bootloader will automatically select C28x Native Mode after reset
- C24x compatible mode is mostly for backwards compatibility with an older processor family
Reset
[Figure: Reset and the bootloader]

At reset:
- OBJMODE = 0, AMODE = 0
- ENPIE = 0, INTM = 1
- Reset vector fetched from boot ROM at 0x3F FFC0

Bootloader sets:
- OBJMODE = 1
- AMODE = 0
Note: Details of the various boot options will be discussed in the Reset and Interrupts module
Summary
- High performance 32-bit DSP
- 32x32 bit or dual 16x16 bit MAC
- IEEE single-precision floating point unit
- Atomic read-modify-write instructions
- Fast interrupt response manager
- 256Kw on-chip flash memory
- Code security module (CSM)
- Control peripherals
- 12-bit ADC module
- Up to 88 shared GPIO pins
- Watchdog timer
- DMA and external memory interface
- Communications peripherals
Learning Objectives
Use Code Composer Studio to:
- Create a Project
- Set Build Options
Module Topics
Programming Development Environment
Module Topics
Code Composer Studio
    Software Development and COFF Concepts
    Projects
    Build Options
Creating a Linker Command File
    Sections
    Linker Command Files (.cmd)
    Memory-Map Description
    Section Placement
Exercise 2
    Summary: Linker Command File
Lab 2: Linker Command File
Solutions
[Figure: Code Composer Studio software development flow, including the editor, libraries, and graphs/profiling]
Code Composer Studio includes a built-in editor, compiler, assembler, linker, and an automatic build process. Additionally, tools to connect file input and output, as well as built-in graph displays for output, are available. Other features can be added using the plug-ins capability. Numerous modules are joined to form a complete program by using the linker. The linker efficiently allocates the resources available on the device to each module in the system. The linker uses a command (.CMD) file to identify the memory resources and the placement of the various sections within each module. Outputs of the linking process include the linked object file (.OUT), which runs on the device, and can include a .MAP file which identifies where each linked section is located. The high level of modularity and portability resulting from this system simplifies the processes of verification, debug and maintenance. The process of COFF development is presented in greater detail in the following paragraphs.
The concept of COFF tools is to allow modular development of software independent of hardware concerns. An individual assembly language file is written to perform a single task and may be linked with several other tasks to achieve a more complex total system. Writing code in modular form permits code to be developed by several people working in parallel, so the development cycle is shortened. Debugging and upgrading code is faster, since components of the system, rather than the entire system, are being operated upon. Also, new systems may be developed more rapidly if previously developed modules can be used in them. Code developed independently of hardware concerns increases the benefits of modularity by allowing the programmer to focus on the code and not waste time managing memory and moving code as other code components grow or shrink. A linker is invoked to allocate system hardware to the modules desired to build a system. Changes in any or all modules, when re-linked, create a new hardware allocation, avoiding the possibility of memory resource conflicts.
- Integrates: edit, code generation, and debug
- Single-click access using buttons
- Powerful graphing/profiling tools
- Automated tasks using GEL scripts and CCS scripting
- Built-in access to BIOS functions
- Supports TI and 3rd party plug-ins
Projects
Code Composer works with a project paradigm. Essentially, within CCS you create a project for each executable program you wish to create. Projects store all the information required to build the executable. For example, a project lists things like the source files, the header files, the target system's memory map, and program build options.

Project settings:
- Build options (compiler, linker, assembler, and DSP/BIOS)
- Build configurations
The project information is stored in a .PJT file, which is created and maintained by CCS. To create a new project, you need to select the Project:New menu item. Along with the main Project menu, you can also manage open projects using the right-click popup menu. Either of these menus allows you to Add Files to a project. Of course, you can also drag-n-drop files onto the project from Windows Explorer.
Build Options
Project options direct the code generation tools (i.e. compiler, assembler, linker) to create code according to your system's needs. When you create a new project, CCS creates two sets of build options called Configurations: one called Debug, the other Release (which you might think of as Optimize). To make it easier to choose build options, CCS provides a graphical user interface (GUI) for the various compiler options. Here's a sample of the Debug configuration options.

- GUI has 8 pages of categories for code generation tools
- Controls many aspects of the build process, such as:
    - Optimization level
    - Target device
    - Compiler/assembly/link options

There is a one-to-one relationship between the items in the text box and the GUI check and dropdown box selections. Once you have mastered the various options, you will probably find yourself just typing in the options.
.\Debug means the directory called Debug one level below the .pjt file directory. $(Proj_dir)\Debug is an equivalent expression.
There are many linker options, but these four handle all of the basic needs.
- -o <filename> specifies the output (executable) filename.
- -m <filename> creates a map file. This file reports the linker's results.
- -c tells the linker to autoinitialize your global and static variables.
- -x tells the linker to exhaustively read the libraries. Without this option libraries are searched only once, and therefore backwards references may not be resolved.
Add/Remove your own custom build configurations using Project Configurations.
Edit a configuration:
To help make sense of the many compiler options, TI provides two default sets of options (configurations) in each new project you create. The Release (optimized) configuration invokes the optimizer with -o3 and disables source-level, symbolic debugging by omitting -g (which disables some optimizations to enable debug).
Sections
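[Figure: a C program split into sections; for the globals int x = 2; and int y = 7; the variables occupy global variable space (.ebss) and their initial values are kept in .cinit]

- All code consists of different parts called sections
- All default section names begin with "."
- The compiler has default section names for initialized and uninitialized sections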
In the TI code-generation tools (as with any toolset based on COFF, the Common Object File Format), these various parts of a program are called Sections. Breaking the program code and data into various sections provides flexibility, since it allows you to place code sections in ROM and variables in RAM. The preceding diagram illustrated four sections:
- Global Variables
- Initial Values for global variables
- Local Variables (i.e. the stack)
- Code (the actual instructions)
Following is a list of the sections that are created by the compiler. Along with their description, we provide the Section Name defined by the compiler.

Uninitialized Sections
Name      Description
.ebss     global and static variables
.stack    stack space
Note: During development initialized sections could be linked to RAM since the emulator can be used to load the RAM
Sections of a C program must be located in different memories in your target system. This is the big advantage of creating the separate sections for code, constants, and variables. In this way, they can all be linked (located) into their proper memory locations in your target embedded system. Generally, they're located as follows:

Program Code (.text)
Program code consists of the sequence of instructions used to manipulate data, initialize system settings, etc. Program code must be defined upon system reset (power turn-on). Due to this basic system constraint, it is usually necessary to place program code into non-volatile memory, such as FLASH or EPROM.

Constants (.cinit: initialized data)
Initialized data are those data memory locations defined at reset. They contain constants or initial values for variables. Similar to program code, constant data is expected to be valid upon reset of the system. It is often found in FLASH or EPROM (non-volatile memory).

Variables (.ebss: uninitialized data)
Uninitialized data memory locations can be changed and manipulated by the program code during runtime execution. Unlike program code or constants, uninitialized data or variables must reside in volatile memory, such as RAM. These memories can be modified and updated, supporting the way variables are used in math formulas, high-level languages, etc. Each variable must be declared with a directive to reserve memory to contain its value. By their nature, no value is assigned; instead, they are loaded at runtime by the program.
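A short C fragment shows how source constructs relate to the sections described above. The comments restate the section roles from this module; the function name and the variable `z` are illustrative, and whether a particular local ends up on the stack or in a register is left to the compiler.

```c
int x = 2;      /* storage reserved in .ebss; initial value 2 recorded in .cinit */
int y = 7;      /* storage reserved in .ebss; initial value 7 recorded in .cinit */
int z;          /* storage reserved in .ebss; no explicit initial value */

/* The instructions of this function are placed in .text. */
int add_globals(void)
{
    int sum;    /* local variable: lives on the stack (.stack) or in a register */
    sum = x + y;
    z = sum;
    return sum;
}
```

Placing .text and .cinit in non-volatile memory and .ebss and .stack in RAM is what lets the program start correctly after reset.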
Linking code is a three step process:
1. Defining the various regions of memory (on-chip SARAM vs. FLASH vs. External Memory).
2. Describing what sections go into which memory regions.
3. Running the linker with build or rebuild.

[Figure: Linking: the linker command file provides the memory description and how to place s/w into h/w; the linker also produces a .map file]
Memory-Map Description
The MEMORY section describes the memory configuration of the target system to the linker. The format is:

    Name: origin = 0x????, length = 0x????

For example, if you placed a 64Kw FLASH starting at memory location 0x3E8000, it would read:

    MEMORY
    {
        FLASH:   origin = 0x3E8000, length = 0x10000
    }

Each memory segment is defined using the above format. If you added M0SARAM and M1SARAM, it would look like:

    MEMORY
    {
        M0SARAM: origin = 0x000000, length = 0x400
        M1SARAM: origin = 0x000400, length = 0x400
    }
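When writing MEMORY entries by hand, it helps to check the arithmetic: convert Kw to a hexadecimal length, find the last address of each region, and confirm that regions do not overlap. The following host-side sketch does that; the struct and function names are invented for this illustration and are not part of the TI tools.

```c
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    origin;   /* first word address */
    uint32_t    length;   /* size in 16-bit words */
} mem_region_t;

/* Convert a size in Kw (1 Kw = 1024 words) to a word count. */
uint32_t kw_to_words(uint32_t kw)
{
    return kw * 1024u;
}

/* Last word address covered by the region (length must be non-zero). */
uint32_t region_last(const mem_region_t *r)
{
    return r->origin + r->length - 1u;
}

/* Non-zero if the two regions share at least one address. */
int regions_overlap(const mem_region_t *a, const mem_region_t *b)
{
    return a->origin <= region_last(b) && b->origin <= region_last(a);
}
```

For the 64Kw FLASH example, 64 x 1024 = 0x10000 words, so the region runs from 0x3E8000 through 0x3F7FFF.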
Remember that the DSP has two memory maps: Program and Data. Therefore, the MEMORY description must describe each of these separately. The linker uses the following syntax to delineate each of these:

Linker Page    TI Definition
Page 0         Program
Page 1         Data

    MEMORY
    {
      PAGE 0:    /* Program Memory */
        FLASH:   origin = 0x3E8000, length = 0x10000
      PAGE 1:    /* Data Memory */
        M0SARAM: origin = 0x000000, length = 0x400
        M1SARAM: origin = 0x000400, length = 0x400
    }
Section Placement
The SECTIONS section will specify how you want the sections to be distributed through memory. The following code is used to link the sections into the memory specified in the previous example:

    SECTIONS
    {
        .text:>    FLASH      PAGE 0
        .ebss:>               PAGE 1
        .cinit:>   FLASH      PAGE 0
        .stack:>              PAGE 1
    }
The linker will gather all the code sections from all the files being linked together. Similarly, it will combine all like sections. Beginning with the first section listed, the linker will place it into the specified memory segment.
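The sequential placement described above can be modelled with a short routine that assigns start addresses to sections packed back to back in one memory region. This is a deliberately simplified model with an invented function name: the real linker also deals with alignment, blocking, and other options that are not represented here.

```c
#include <stddef.h>
#include <stdint.h>

/* Assign start addresses to sections placed back to back in one region.
   sizes[i] is the size of section i in words; addrs[i] receives its start.
   Returns 0 on success, -1 if the sections do not fit in the region. */
int place_sections(uint32_t origin, uint32_t length,
                   const uint32_t *sizes, size_t count, uint32_t *addrs)
{
    uint32_t used = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        if (sizes[i] > length - used) {
            return -1;               /* region overflow */
        }
        addrs[i] = origin + used;
        used += sizes[i];
    }
    return 0;
}
```

A failure return corresponds to the situation where the linker would report that a section does not fit in the named memory region.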
Exercise 2
Look at the following block diagram and create a linker command file.
[Figure: Exercise 2 memory map for a generic F28x device]

Block      Start address   Length
M0SARAM    0x00 0000       0x400
M1SARAM    0x00 0400       0x400
L0SARAM    0x00 8000       0x1000
L1SARAM    0x00 9000       0x1000
FLASH      0x30 0000       0x40000

Create the linker command file for the given memory map by filling in the blanks on the following slide.
Sections Description
- Directs software sections into named memory regions
- Allows per-file discrimination
- Allows separate load/run locations
System Description:
- TMS320F28335
- All internal RAM blocks allocated

Placement of Sections:
- .text into RAM Block L0123SARAM on PAGE 0 (program memory)
- .cinit into RAM Block L0123SARAM on PAGE 0 (program memory)
- .ebss into RAM Block L4SARAM on PAGE 1 (data memory)
- .stack into RAM Block M1SARAM on PAGE 1 (data memory)
Procedure
Click: Debug -> Connect

The menu bar (at the top) lists File ... Help. Note the horizontal tool bar below the menu bar and the vertical tool bar on the left-hand side. The window on the left is the project window and the large right-hand window is your workspace.

2. A project is all the files you will need to develop an executable output file (.out) which can be run on the DSP hardware. Let's create a new project for this lab. On the menu bar click: Project -> New. Type Lab2 in the project name field and make sure the save in location is: C:\C28x\Labs\Lab2, then click Finish. This will create a .pjt file which will invoke all the necessary tools (compiler, assembler, linker) to build your project. It will also create a debug folder that will hold intermediate output files.

3. Add the C file to the new project. Click: Project -> Add Files to Project and make sure you're looking in C:\C28x\Labs\Lab2. Change the files of type to view C source files (*.c), select Lab2.c, and click OPEN. This will add the file Lab2.c to your newly created project.

4. Add Lab2.cmd to the project using the same procedure. This file will be edited during the lab exercise.

5. In the project window on the left, click the plus sign (+) to the left of Project. Now, click on the plus sign next to Lab2.pjt. Notice that the Lab2.cmd file is listed. Click on Source to see the current source file list (i.e. Lab2.c).

7. Select the Linker tab. Notice that .out and .map files are being created. The .out file is the executable code that will be loaded into the DSP. The .map file will contain a linker report showing memory usage and section addresses in memory.

8. Set the Stack Size to 0x200.

9. Next, set up the compiler run-time support library. In the Libraries Category, find the Include Libraries (-l) box and enter: rts2800_ml.lib. Select OK and the Build Options window will close.
11. Edit the Memory{} declaration by describing the system memory shown on the Lab2: Linker Command File slide in the objective section of this lab exercise. Combine the memory blocks L0SARAM, L1SARAM, L2SARAM, and L3SARAM into a single memory block called L0123SARAM. Place this combined memory block into program memory on page 0. Place the other memory blocks into data memory on page 1.

12. In the Sections{} area, notice that a section called .reset has already been allocated. The .reset section is part of the rts2800_ml.lib, and is not needed. By putting the TYPE = DSECT modifier after its allocation, the linker will ignore this section and not allocate it.

13. Place the sections defined on the slide into the appropriate memories via the Sections{} area. Save your work and close the file.
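Combining adjacent blocks into one MEMORY entry is simple arithmetic: the combined origin is the first block's origin and the combined length is the sum of the block lengths. The sketch below uses an invented helper name and assumes the blocks are equally sized and contiguous, as L0 through L3 are in the memory map (4Kw each, starting at 0x008000).

```c
#include <stdint.h>

/* Merge `count` equally sized, back-to-back blocks starting at first_origin.
   Writes the combined origin and length; returns 0, or -1 if count is 0. */
int combine_blocks(uint32_t first_origin, uint32_t block_len, unsigned count,
                   uint32_t *origin, uint32_t *length)
{
    if (count == 0u) {
        return -1;
    }
    *origin = first_origin;
    *length = block_len * count;
    return 0;
}
```

For four 0x1000-word blocks starting at 0x008000, this gives an origin of 0x008000 and a length of 0x4000 for L0123SARAM.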
15. Code Composer Studio can automatically load the output file after a successful build. On the menu bar click: Option Customize and select the Program/Project/CIO tab, check Load Program After Build. Also, Code Composer Studio can automatically connect to the target when started. Select the Debug Properties tab, check Connect to the target at startup, then click OK. 16. Click the Build button and watch the tools run in the build window. Check for errors (we have deliberately put an error in Lab2.c). When you get an error, scroll the build window at the bottom of the Code Composer Studio screen until you see the error message (in red), and simply double-click the error message. The editor will automatically open the source file containing the error, and position the mouse cursor at the correct code line. 17. Fix the error by adding a semicolon at the end of the "z = x + y" statement. For future knowlege, realize that a single code error can sometimes generate multiple error messages at build time. This was not the case here. 18. Rebuild the project (there should be no errors this time). The output file should automatically load. The Program Counter should be pointing to _c_int00 in the Disassembly Window. 19. Under Debug on the menu bar click Go Main. This will run through the Cenvironment initialization routine in the rts2800_ml.lib and stop at main() in Lab2.c.
Type &z into the address field and then enter. Note that you must use the ampersand (meaning "address of") when using a symbol in a memory window address box. Also note that Code Composer Studio is case sensitive. Set the properties format to Hex 16 Bit TI style at the bottom of the window. This will give you more viewable data in the window. You can change the contents of any address in the memory window by double-clicking on its value. This is useful during debug.
21. Open the watch window to view the local variables x and y. Click: View → Watch Window on the menu bar. Click the Watch Locals tab and notice that the local variables x and y are already present. The watch window will always contain the local variables for the code function currently being executed. (Note that local variables actually live on the stack. You can also view local variables in a memory window by setting the address to SP after the code function has been entered.)
22. We can also add global variables to the watch window if desired. Let's add the global variable z. Click the Watch 1 tab at the bottom of the watch window. In the empty box in the Name column, type z and then enter. Note that you do not use an ampersand here. The watch window knows you are specifying a symbol. Check that the watch window and memory window both report the same value for z. Try changing the value in one window, and notice that the value also changes in the other window.
Solutions
Exercise 2 - Solution
MEMORY
{
  PAGE 0:                /* Program Memory */
    FLASH:    origin = 0x300000, length = ...
  PAGE 1:                /* Data Memory */
    M0SARAM:  origin = 0x000000, length = ...
    M1SARAM:  origin = 0x000400, length = ...
    L0SARAM:  origin = 0x008000, length = ...
    L1SARAM:  origin = 0x009000, length = ...
}
SECTIONS
{
  .text:    > ...   PAGE = 0
  .ebss:    > ...   PAGE = 1
  .cinit:   > ...   PAGE = 0
  .stack:   > ...   PAGE = 1
}
= = = = =
0 1 0 1 0, TYPE = DSECT
Learning Objectives
- Understand the usage of the F2833x C-Code Header Files
- Be able to program peripheral registers
- Understand how the structures are mapped with the linker command file
Module Topics
Peripheral Registers Header Files
   Module Topics
   Traditional and Structure Approach to C Coding
   Naming Conventions
   F2833x C-Code Header Files
      Peripheral Structure .h File
      Global Variable Definitions File
      Mapping Structures to Memory
      Linker Command File
      Peripheral Specific Routines
   Summary
Traditional Approach to C Coding
Advantages
- Simple, fast and easy to type - Variable names exactly match register names (easy to remember)
Disadvantages
- Requires individual masks to be generated to manipulate individual bits - Cannot easily display bit fields in Watch window - Will generate less efficient code in many cases
Structure Approach to C Coding
Advantages
- Easy to manipulate individual bits. - Watch window is amazing! (next slide) - Generates most efficient code (on C28x)
Disadvantages
- Can be difficult to remember the structure names (Editor Auto Complete feature to the rescue!) - More to type (again, Editor Auto Complete feature to the rescue)
C Source Code
// Stop CPU Timer0
CpuTimer0Regs.TCR.bit.TSS = 1;
// Load new 32-bit period value
CpuTimer0Regs.PRD.all = 0x00010000;
// Start CPU Timer0
CpuTimer0Regs.TCR.bit.TSS = 0;
- Easy to read the code w/o comments - Bit mask built-in to structure
5 words, 5 cycles
You could not have coded this example any more efficiently with hand assembly!
* C28x Compiler v5.0.1 with -g and either -o1, -o2, or -o3 optimization level
- Hard to read the code w/o comments - User had to determine the bit mask
9 words, 9 cycles
* C28x Compiler v5.0.1 with -g and either -o1, -o2, or -o3 optimization level
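The two coding styles compared above can be tried on a host compiler. The sketch below is an illustration only: the type names (TCR_BITS_SKETCH, TCR_REG_SKETCH), the reserved-field split, and the mask value 0x0010 for TSS are assumptions made for this example and are not copied from the DSP2833x header files. Bit-field layout is compiler dependent, so the sketch does not rely on how .all and .bit overlap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask for the TSS bit; the value is an assumption. */
#define TSS_MASK_SKETCH 0x0010u

/* Traditional approach: the user has to determine and apply the mask. */
uint16_t tcr_stop_traditional(uint16_t tcr)
{
    return (uint16_t)(tcr | TSS_MASK_SKETCH);
}

uint16_t tcr_start_traditional(uint16_t tcr)
{
    return (uint16_t)(tcr & (uint16_t)~TSS_MASK_SKETCH);
}

/* Structure approach: the mask is built into a named bit field. */
struct TCR_BITS_SKETCH {
    unsigned int rsvd_low  : 4;   /* placeholder for the bits below TSS */
    unsigned int TSS       : 1;   /* timer stop bit */
    unsigned int rsvd_high : 11;  /* placeholder for the remaining bits */
};

union TCR_REG_SKETCH {
    uint16_t all;                 /* whole-register access */
    struct TCR_BITS_SKETCH bit;   /* field-by-field access */
};

void tcr_stop_structure(union TCR_REG_SKETCH *tcr)
{
    assert(tcr != 0);
    tcr->bit.TSS = 1;             /* no mask needed at the call site */
}

void tcr_start_structure(union TCR_REG_SKETCH *tcr)
{
    assert(tcr != 0);
    tcr->bit.TSS = 0;
}
```

With the structure form, the call site names the field and the compiler supplies the mask, which is why the code reads well without comments.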
Naming Conventions
The header files use a familiar set of naming conventions. They are consistent with the Code Composer Studio configuration tool and with the generated file naming conventions.
Notes:
[1] PeripheralName values are assigned by TI and found in the DSP2833x header files. They are a combination of capital and small letters (e.g. CpuTimer0Regs).
[2] RegisterName values are the same names as used in the data sheet. They are always in capital letters (e.g. TCR, TIM, TPR, ...).
[3] FieldName values are the same names as used in the data sheet. They are always in capital letters (e.g. POL, TOG, TSS, ...).
- Contains everything needed to use the structure approach
- Defines all peripheral register bits and register addresses
- Header file package includes:
    \DSP2833x_headers\include    .h files
    \DSP2833x_headers\cmd        linker .cmd files
    \DSP2833x_headers\gel        .gel files for CCS
    \DSP2833x_examples           2833x examples
    \DSP2823x_examples           2823x examples
    \doc                         documentation
- A peripheral is programmed by writing values to a set of registers. Sometimes, individual fields are written to as bits, or as bytes, or as entire words.
- Unions are used to overlap memory (register) so the contents can be accessed in different ways.
- The header files group all the registers belonging to a specific peripheral.
- A DSP2833x_Peripheral.gel GEL file can provide a pull-down menu to load peripheral data structures into a watch window.
- Code Composer Studio can load a GEL file automatically. To add functions to the standard F28335.gel that is part of Code Composer Studio, add: GEL_LoadGel("base_path/gel/DSP2833x_Peripheral.gel")
- The GEL file can also be loaded during a Code Composer Studio session by clicking: File → Load GEL
DSP2833x_Device.h
- Main include file (for 2833x and 2823x devices)
- Will include all other .h files
- Include this file in each source file:
#include "DSP2833x_Device.h"
- Declares a global instantiation of the structure for each peripheral
- Each structure put in its own section using a DATA_SECTION pragma to allow linking to correct memory (see next slide)
DSP2833x_GlobalVariableDefs.c
#include "DSP2833x_Device.h"
#pragma DATA_SECTION(AdcRegs,"AdcRegsFile");
volatile struct ADC_REGS AdcRegs;
- Links each structure to the address of the peripheral using the structure's named section
- non-BIOS and BIOS versions of the .cmd file
- Add one of these files to your CCS project:
DSP2833x_nonBIOS.cmd or DSP2833x_BIOS.cmd
DSP2833x_Headers_nonBIOS.cmd
MEMORY
{
   PAGE 1:
      ...
      ADC:   origin=0x007100, length=0x000020
      ...
}

SECTIONS
{
   ...
   AdcRegsFile:   >  ADC   PAGE = 1
   ...
}
Summary
Go to http://www.ti.com and enter the literature number in the keyword search box
Learning Objectives
- Describe the C28x reset process and post-reset device state
- List the event sequence during an interrupt
- Describe the C28x interrupt structure
Module Topics
Reset and Interrupts
   Module Topics
   Reset
      Reset - Bootloader
   Interrupts
      Interrupt Processing
      Peripheral Interrupt Expansion (PIE)
      PIE Interrupt Vector Table
      Interrupt Response and Latency
Reset
Reset Sources
C28x Core Watchdog Timer XRS XRS pin active To XRS pin
Reset - Bootloader
Reset
   OBJMODE = 0, AMODE = 0, ENPIE = 0, INTM = 1
   Reset vector fetched from boot ROM at 0x3F FFC0
Bootloader sets
   OBJMODE = 1, AMODE = 0
Bootloader Options
GPIO pins
87 /   86 /   85 /   84 /
XA15   XA14   XA13   XA12
 1      1      1      1     jump to FLASH address 0x33 FFF6
 1      1      1      0     bootload code to on-chip memory via SCI-A
 1      1      0      1     bootload external EEPROM to on-chip memory via SPI-A
 1      1      0      0     bootload external EEPROM to on-chip memory via I2C
 1      0      1      1     Call CAN_Boot to load from eCAN-A mailbox 1
 1      0      1      0     bootload code to on-chip memory via McBSP-A
 1      0      0      1     jump to XINTF Zone 6 address 0x10 0000 for 16-bit data
 1      0      0      0     jump to XINTF Zone 6 address 0x10 0000 for 32-bit data
 0      1      1      1     jump to OTP address 0x38 0400
 0      1      1      0     bootload code to on-chip memory via GPIO port A (parallel)
 0      1      0      1     bootload code to on-chip memory via XINTF (parallel)
 0      1      0      0     jump to M0 SARAM address 0x00 0000
 0      0      1      1     branch to check boot mode
 0      0      1      0     branch to Flash without ADC calibration (TI debug only)
 0      0      0      1     branch to M0 SARAM without ADC calibration (TI debug only)
 0      0      0      0     branch to SCI-A without ADC calibration (TI debug only)
OTP (1Kw)
0x3F E000
0x3F F9A9
RESET
0x3F FFC0
Interrupts
Interrupt Sources
Internal Sources
TINT2 TINT1 TINT0 ePWM, eCAP, eQEP, ADC, SCI, SPI, I2C, eCAN, McBSP, DMA, WD PIE (Peripheral Interrupt Expansion)
C28x CORE
XRS NMI INT1 INT2 INT3 INT12 INT13 INT14
External Sources
XINT1 XINT7 TZx XRS XNMI_XINT13
Interrupt Processing
(IFR) Latch
1 0
C28x Core
INT14
- A valid signal on a specific interrupt line causes the latch to display a 1 in the appropriate bit
- If the individual and global switches are turned on, the interrupt reaches the core
Interrupt Flag Register (IFR)
Bit:    15        14        13      12      11      10      9       8
        RTOSINT   DLOGINT   INT14   INT13   INT12   INT11   INT10   INT9
Bit:    7         6         5       4       3       2       1       0
        INT8      INT7      INT6    INT5    INT4    INT3    INT2    INT1
Pending : Absent :
/*** Manual setting/clearing IFR ***/
extern cregister volatile unsigned int IFR;
IFR |= 0x0008;    //set INT4 in IFR
IFR &= 0xFFF7;    //clear INT4 in IFR

- Compiler generates atomic instructions (non-interruptible) for setting/clearing IFR
- If interrupt occurs when writing IFR, interrupt has priority
- IFR(bit) cleared when interrupt is acknowledged by CPU
- Register cleared on reset
Interrupt Enable Register (IER)
Bit:    15        14        13      12      11      10      9       8
        RTOSINT   DLOGINT   INT14   INT13   INT12   INT11   INT10   INT9
Bit:    7         6         5       4       3       2       1       0
        INT8      INT7      INT6    INT5    INT4    INT3    INT2    INT1
/*** Interrupt Enable Register ***/
extern cregister volatile unsigned int IER;
IER |= 0x0008;    //enable INT4 in IER
IER &= 0xFFF7;    //disable INT4 in IER

- Compiler generates atomic instructions (non-interruptible) for setting/clearing IER
- Register cleared on reset
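The constants 0x0008 and 0xFFF7 used for INT4 follow from the bit layout: INTn sits in bit n-1 of both IFR and IER. The host-side sketch below derives the masks so the arithmetic can be checked; the function names are hypothetical, and on the target the cregister variables IFR and IER are modified directly as shown on the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for core interrupt line INTn, n = 1..14 (INTn is bit n-1). */
uint16_t core_int_mask(unsigned int n)
{
    assert(n >= 1u && n <= 14u);
    return (uint16_t)(1u << (n - 1u));
}

/* Value a register would hold after "REG |= mask". */
uint16_t core_int_set(uint16_t reg, unsigned int n)
{
    return (uint16_t)(reg | core_int_mask(n));
}

/* Value a register would hold after "REG &= ~mask". */
uint16_t core_int_clear(uint16_t reg, unsigned int n)
{
    return (uint16_t)(reg & (uint16_t)~core_int_mask(n));
}
```

For INT4 this gives a set mask of 0x0008 and a clear mask of 0xFFF7, matching the slide code.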
ST1
INTM
INTM modified from assembly code only:
/*** Global Interrupts ***/
asm(" CLRC INTM");    //enable global interrupts
asm(" SETC INTM");    //disable global interrupts
[Figure: Peripheral Interrupt Expansion (PIE). 96 peripheral interrupts are arranged in 12 interrupt groups (INT1.x to INT12.x, for example INT1.1 to INT1.8). Each group is multiplexed onto one of the 12 core interrupt lines INT1 to INT12, which pass through IFR, IER and INTM to reach the 28x core.]
PIE Registers
PIEIFRx register
15 - 8 7
(x = 1 to 12)
6 5 4 3 2 1 0
reserved
PIEIERx register
15 - 8 7
(x = 1 to 12)
6 5 4 3 2 1 0
reserved
reserved
PIEACKx
15 - 1 0
PIECTRL register
PIEVECT
ENPIE
#include "DSP2833x_Device.h"
PieCtrlRegs.PIEIFR1.bit.INTx4 = 1;    //manually set IFR for XINT1 in PIE group 1
PieCtrlRegs.PIEIER3.bit.INTx5 = 1;    //enable EPWM5_INT in PIE group 3
PieCtrlRegs.PIEACK.all = 0x0004;      //acknowledge the PIE group 3
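The values in the code above follow a simple pattern: interrupt INTx.y uses bit y-1 in PIEIFRx and PIEIERx, and group x uses bit x-1 in PIEACK and in the core IER. The sketch below is a simplified host-side model of the enable chain (PIE enable, core enable, global INTM switch). The struct and function names are hypothetical, and the model leaves out the PIEACK gating and ENPIE for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Bit mask for interrupt y (1..8) inside a PIE group register. */
uint16_t pie_int_mask(unsigned int y)
{
    assert(y >= 1u && y <= 8u);
    return (uint16_t)(1u << (y - 1u));
}

/* Bit mask for PIE group x (1..12) in PIEACK and in the core IER. */
uint16_t pie_group_mask(unsigned int x)
{
    assert(x >= 1u && x <= 12u);
    return (uint16_t)(1u << (x - 1u));
}

/* Simplified snapshot of the switches along the path for one group. */
struct pie_path_sketch {
    uint16_t pieifr;  /* PIEIFRx: latched requests in group x */
    uint16_t pieier;  /* PIEIERx: individual enables in group x */
    uint16_t ier;     /* core IER */
    int intm;         /* global mask: 0 = enabled, 1 = disabled */
};

/* Returns 1 when a pending INTx.y would reach the core in this model. */
int pie_request_reaches_core(const struct pie_path_sketch *p,
                             unsigned int x, unsigned int y)
{
    uint16_t m = pie_int_mask(y);
    assert(p != 0);
    return (p->pieifr & m) && (p->pieier & m) &&
           (p->ier & pie_group_mask(x)) && (p->intm == 0);
}
```

For EPWM5_INT (INT3.5) the model needs bit 4 set in PIEIER3, bit 2 set in IER and INTM cleared, and the matching acknowledge value is 0x0004.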
Memory
0x00 0D00
0x3F FFFF
- PIE vector location: 0x00 0D00, 256 words in data memory
- RESET and INT1-INT12 vector locations are re-mapped
- CPU vectors are re-mapped to 0x00 0D00 in data memory
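As a rough illustration of how the 256-word table is laid out, the sketch below computes the address of the vector for INTx.y. The base address 0x0D00 and the 256-word size come from the slide. The 2-word (32-bit) vector size and the 0x40-word offset of the first peripheral vector are assumptions made for this example, and the function name is hypothetical; check the System Control and Interrupts Reference Guide for the authoritative table.

```c
#include <assert.h>
#include <stdint.h>

#define PIE_VECT_BASE        0x0D00u  /* from the slide */
#define PIE_VECT_WORDS       256u     /* from the slide */
#define PIE_GROUP1_OFFSET    0x40u    /* assumption: first 32 vectors are CPU vectors */
#define PIE_WORDS_PER_VECTOR 2u       /* assumption: each vector is a 32-bit address */

/* Address of the vector for INTx.y, x = 1..12, y = 1..8. */
uint32_t pie_vector_address(unsigned int x, unsigned int y)
{
    uint32_t index;
    assert(x >= 1u && x <= 12u);
    assert(y >= 1u && y <= 8u);
    index = (x - 1u) * 8u + (y - 1u);   /* 0..95 across the 12 groups */
    return PIE_VECT_BASE + PIE_GROUP1_OFFSET + index * PIE_WORDS_PER_VECTOR;
}
```

Under these assumptions the 96 peripheral vectors plus 32 CPU vectors fill exactly 128 x 2 = 256 words.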
INTx.7
TINT0
INTx.6
ADCINT EPWM6 _TZINT EPWM6 _INT ECAP6 _INT
INTx.5
XINT2 EPWM5 _TZINT EPWM5 _INT ECAP5 _INT
INTx.4
XINT1 EPWM4 _TZINT EPWM4 _INT ECAP4 _INT
INTx.3
INTx.2
SEQ2INT
INTx.1
SEQ1INT EPWM1 _TZINT EPWM1 _INT ECAP1 _INT EQEP1 _INT
_c_int00:
   . . .
   CALL main()

main()
{
   initialization();
   . . .
}

Initialization()
{
   Load PIE Vectors
   Enable the PIE
   Enable PIEIER
   Enable Core IER
   Enable INTM
}
Description
- 14 Register words auto saved
- Clear corresponding IFR bit
- Clear corresponding IER bit
- Disable global ints/debug events
- Loads PC with int vector address
- Clear LOOP, EALLOW, IDLESTAT
Note: some actions occur simultaneously, none are interruptible

Registers auto saved (in pairs):
   T         ST0
   AH        AL
   PH        PL
   AR1       AR0
   DP        ST1
   DBSTAT    IER
   PC(msw)   PC(lsw)
Interrupt Latency
Latency
ext. interrupt occurs here Internal interrupt occurs here Assumes ISR in internal RAM cycles
2
Sync ext. signal (ext. interrupt only)
3
F1/F2/D1 of ISR instruction (3 reg. pairs saved)
Recognition Get vector delay (3), SP and place alignment (1), in PC interrupt (3 reg. placed in pairs pipeline saved)
Minimum latency (to when real work occurs in the ISR):
- Internal interrupts: 14 cycles
- External interrupts: 16 cycles
Maximum latency:
- Depends on wait states, INTM, etc.
System Initialization
Introduction
This module discusses the operation of the OSC/PLL-based clock module and watchdog timer. Also, the general-purpose digital I/O ports, external interrupts, various low power modes and the EALLOW protected registers will be covered.
Learning Objectives
- OSC/PLL Clock Module
- Watchdog Timer
- General Purpose Digital I/O
- External Interrupts
- Low Power Modes
- Register Protection
Module Topics
System Initialization
   Module Topics
   Oscillator/PLL Clock Module
   Watchdog Timer
   General-Purpose Digital I/O
   External Interrupts
   Low Power Modes
   Register Protection
   Lab 5: System Initialization
Watchdog Module
CLKIN
C28x Core
XTAL OSC
MUX
X1 crystal X2
1/n
HISPCP
SYSCLKOUT
PLL
LOSPCP
SysCtrlRegs.PLLSTS.bit.DIVSEL
   DIVSEL   n
   0x       /4 *
   10       /2
   11       /1

SysCtrlRegs.PLLCR.bit.DIV
   DIV        CLKIN
   0 0 0 0    OSCCLK / n * (PLL bypass)
   0 0 0 1    OSCCLK x 1 / n
   0 0 1 0    OSCCLK x 2 / n
   0 0 1 1    OSCCLK x 3 / n
   0 1 0 0    OSCCLK x 4 / n
   0 1 0 1    OSCCLK x 5 / n
   0 1 1 0    OSCCLK x 6 / n
   0 1 1 1    OSCCLK x 7 / n
   1 0 0 0    OSCCLK x 8 / n
   1 0 0 1    OSCCLK x 9 / n
   1 0 1 0    OSCCLK x 10 / n

HSPCLK: ADC
LSPCLK: SCI, SPI, I2C, McBSP
All other peripherals clocked by SYSCLKOUT
Input Clock Fail Detect Circuitry PLL will issue a limp mode clock (1-4 MHz) if input clock is removed after PLL has locked. An internal device reset will also be issued (XRSn pin not driven).
The OSC/PLL clock module provides all the necessary clocking signals for C28x devices. The PLL has a 4-bit ratio control to select different CPU clock rates. Two modes of operation are supported: crystal operation, and external clock source operation. Crystal operation allows the use of an external crystal/resonator to provide the time base to the device. External clock source operation allows the internal oscillator to be bypassed, and the device clocks are generated from an external clock source input on the XCLKIN pin. The watchdog receives a clock signal from OSCCLK. The C28x core provides a SYSCLKOUT clock signal. This signal is prescaled to provide a clock source for some of the on-chip peripherals through the high-speed and low-speed peripheral clock prescalers. Other peripherals are clocked by SYSCLKOUT and use their own clock prescalers for operation.
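The relationship between OSCCLK, the DIV ratio and the DIVSEL divider can be checked with a few lines of host-side C. This is a sketch of the arithmetic only, based on the DIV and DIVSEL tables in this module; the function names are hypothetical and no register access is modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Divider n selected by the 2-bit DIVSEL field: 0x -> /4, 10 -> /2, 11 -> /1. */
uint32_t pll_divsel_to_n(unsigned int divsel)
{
    assert(divsel <= 3u);
    if (divsel <= 1u) {
        return 4u;
    }
    return (divsel == 2u) ? 2u : 1u;
}

/* CLKIN in Hz for a given OSCCLK, DIV (0..10) and DIVSEL (0..3).
   DIV = 0 is the PLL bypass row of the table: CLKIN = OSCCLK / n. */
uint32_t pll_clkin_hz(uint32_t oscclk_hz, unsigned int div, unsigned int divsel)
{
    uint32_t n;
    uint32_t mult;
    assert(div <= 10u);
    n = pll_divsel_to_n(divsel);
    mult = (div == 0u) ? 1u : (uint32_t)div;
    return (uint32_t)(((uint64_t)oscclk_hz * mult) / n);
}
```

For example, a 30 MHz OSCCLK with DIV = 10 and DIVSEL = 10b (n = 2) gives 150 MHz, while DIV = 0 with DIVSEL = 0 gives 7.5 MHz.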
High / Low Speed Peripheral Clock Prescaler Registers (lab file: SysCtrl.c)
SysCtrlRegs.HISPCP
15 - 3 2-0
reserved
HSPCLK ADC
2- 0
SysCtrlRegs.LOSPCP
15 - 3
reserved
Bits 2-0   Peripheral Clock Frequency
000        SYSCLKOUT / 1
001        SYSCLKOUT / 2 (default HISPCP)
010        SYSCLKOUT / 4 (default LOSPCP)
011        SYSCLKOUT / 6
100        SYSCLKOUT / 8
101        SYSCLKOUT / 10
110        SYSCLKOUT / 12
111        SYSCLKOUT / 14
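The prescaler table follows one rule: a field value of 0 divides by 1, and any other value k divides by 2k. A minimal host-side sketch of that rule is shown below; the function name is hypothetical, and the mapping of field values to table rows assumes the rows are listed in ascending order from 000 to 111.

```c
#include <assert.h>
#include <stdint.h>

/* HSPCLK or LSPCLK in Hz for a 3-bit prescaler field value (0..7). */
uint32_t periph_clk_hz(uint32_t sysclkout_hz, unsigned int field)
{
    uint32_t divider;
    assert(field <= 7u);
    divider = (field == 0u) ? 1u : 2u * (uint32_t)field;
    return sysclkout_hz / divider;   /* integer division, result truncated */
}
```

With SYSCLKOUT at 150 MHz, the default HISPCP setting (field = 1) gives 75 MHz and the default LOSPCP setting (field = 2) gives 37.5 MHz.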
The peripheral clock control register allows individual peripheral clock signals to be enabled or disabled. If a peripheral is not being used, its clock signal could be disabled, thus reducing power consumption.
SysCtrlRegs.PCLKCR0
15 14 13
10
ECANB ENCLK
7
ECANA ENCLK
6
MA ENCLK
5
MB ENCLK
4
SCIB ENCLK
3
SCIA ENCLK
2
reserved
1
SPIA ENCLK
0
reserved
reserved
SCIC ENCLK
13
I2CA ENCLK
12
ADC ENCLK
11
TBCLK SYNC
10
reserved
reserved
SysCtrlRegs.PCLKCR1
15 14 9 8
EQEP2 ENCLK
7
EQEP1 ENCLK
6
ECAP6 ENCLK
5
ECAP5 ENCLK
4
ECAP4 ENCLK
3
ECAP3 ENCLK
2
ECAP2 ENCLK
1
ECAP1 ENCLK
0
reserved
reserved
EPWM6 ENCLK
EPWM5 ENCLK
11
EPWM4 ENCLK
10
EPWM3 ENCLK
EPWM2 ENCLK
EPWM1 ENCLK
7-0
SysCtrlRegs.PCLKCR3
15 - 14 13 12 9 8
reserved
GPIOIN ENCLK
XINTF ENCLK
Watchdog Timer
- Resets the C28x if the CPU crashes
  - Watchdog counter runs independent of CPU
  - If counter overflows, a reset or interrupt is triggered (user selectable)
  - CPU must write correct data key sequence to reset the counter before overflow
- Watchdog must be serviced or disabled within 131,072 instructions after reset
- This translates to 4.37 ms with a 30 MHz OSCCLK
The watchdog timer provides a safeguard against CPU crashes by automatically initiating a reset if it is not serviced by the CPU at regular intervals. In motor control applications, this helps protect the motor and drive electronics when control is lost due to a CPU lockup. Any CPU reset will revert the PWM outputs to a high-impedance state, which should turn off the power converters in a properly designed system. The watchdog timer is running immediately after system power-up/reset, and must be dealt with by software soon after. Specifically, you have 4.37 ms (for a 150 MHz device with a 30 MHz OSCCLK) after any reset before a watchdog-initiated reset will occur. This translates into 131,072 OSCCLK cycles (512 x 256), which is a seemingly tremendous amount! Indeed, this is plenty of time to get the watchdog configured as desired and serviced. A failure of your software to properly handle the watchdog after reset could cause an endless cycle of watchdog-initiated resets to occur.
/512
System Reset
WDCR . 6 WDDIS
WDCNTR . 7 - 0
One-Cycle Delay
WDFLAG WDCR . 7
WDCR . 5 - 3
WDCHK 2-0
WDRST
WDINT
Good Key
1 0 1
3 3
/ /
WDENINT
Bad WDCR Key
WDKEY . 7 - 0
Remember: Watchdog starts counting immediately after reset is released!
Reset default with OSCCLK = 30 MHz computed as (1/30 MHz) * 512 * 256 = 4.37 ms
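The 4.37 ms figure can be reproduced from the fixed /512 divider and the 8-bit counter. The sketch below generalizes the computation to other OSCCLK frequencies and watchdog prescale factors; the function names and the prescale parameter (1 at reset, per the lab description) are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* OSCCLK cycles until the 8-bit watchdog counter overflows:
   fixed /512 divider, 256 counts, times the watchdog prescale factor. */
uint32_t wd_overflow_cycles(uint32_t prescale)
{
    assert(prescale >= 1u);
    return 512u * 256u * prescale;
}

/* Overflow time in microseconds (truncated) for a given OSCCLK. */
uint32_t wd_timeout_us(uint32_t oscclk_hz, uint32_t prescale)
{
    assert(oscclk_hz > 0u);
    return (uint32_t)(((uint64_t)wd_overflow_cycles(prescale) * 1000000u)
                      / oscclk_hz);
}
```

With a 30 MHz OSCCLK and a prescale of 1 this yields 131,072 cycles, or about 4369 microseconds.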
15 - 8
5-3
2-0
reserved
WDFLAG
WDDIS
WDCHK
WDPS
reserved
WDKEY
- Writing any other value has no effect
- Watchdog should not be serviced solely in an ISR
  - If main code crashes, but interrupt continues to execute, the watchdog will not catch the crash
  - Could put the 55h WDKEY in the main code, and the AAh WDKEY in an ISR; this catches main code crashes and also ISR crashes
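The key sequence can be pictured as a two-step handshake: 55h arms the reset, and a following AAh clears the counter. The sketch below is a simplified host-side model of that behavior, written to match the statement that any other value has no effect. The struct and function names are hypothetical, and the real hardware may differ in details not described here.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified watchdog model: an 8-bit counter plus an "armed" flag. */
struct wd_model_sketch {
    uint8_t counter;  /* stands in for WDCNTR */
    int armed;        /* 1 after 0x55 has been written */
};

/* Models a write to WDKEY. */
void wd_write_key(struct wd_model_sketch *wd, uint8_t key)
{
    assert(wd != 0);
    if (key == 0x55u) {
        wd->armed = 1;                  /* first half of the sequence */
    } else if (key == 0xAAu && wd->armed) {
        wd->counter = 0;                /* second half: counter is reset */
        wd->armed = 0;
    }
    /* Any other value (or 0xAA without a preceding 0x55): no effect. */
}
```

Splitting the two writes between main code and an ISR, as suggested above, means the counter is only cleared when both pieces of code are still running.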
15 - 3
reserved
WD Enable Interrupt
0 = WD generates a DSP reset
1 = WD generates a WDINT interrupt
Input Qual
GPIO Port A
Input
GPIO Port B Direction Register (GPBDIR) [GPIO 32 to 63]
GPIO Port B
Qual
GPIO Port C
10 11 00
01
GPxPUD Internal Pull-Up 0 = enable (default GPIO 12-31) 1 = disable (default GPIO 0-11)
Input Qualification
(GPIO 0-63 only)
Pin
* See device datasheet for pin function selection matrices
pin
- Qualification available on ports A & B (GPIO 0 - 63) only
- Individually selectable per pin:
  - no qualification (peripherals only)
  - sync to SYSCLKOUT only
  - qualify 3 samples
  - qualify 6 samples
[Figure: samples taken on the pin; T = qual period]
00 = sync to SYSCLKOUT only *
01 = qual to 3 samples
10 = qual to 6 samples
11 = no sync or qual (for peripheral only; GPIO same as 00)
GPACTRL / GPBCTRL
Bits:    31-24       23-16       15-8        7-0
Field:   QUALPRD3    QUALPRD2    QUALPRD1    QUALPRD0
B:       GPIO63-56   GPIO55-48   GPIO47-40   GPIO39-32
A:       GPIO31-24   GPIO23-16   GPIO15-8    GPIO7-0
External Interrupts
- 8 external interrupt signals: XNMI, XINT1-7
- The signals can be mapped to a variety of pins
  - XNMI, XINT1-2 can be mapped to any of GPIO0-31
  - XINT3-7 can be mapped to any of GPIO32-63
- The eCAP pins and their interrupts can be used as additional external interrupts if needed
- XNMI, XINT1, and XINT2 also each have a free-running 16-bit counter that measures the elapsed time between interrupts
  - The counter resets to zero each time the interrupt occurs
- Pin Selection Register chooses which pin(s) the signal comes out on
- Configuration Register controls the enable/disable and polarity
- Counter Register holds the interrupt counter
CPU Logic Peripheral Watchdog Clock Clock Logic Clock on off off off on on off off on on on off
. . .
. . .
. . .
14 - 8
7-2
1-0
WDINTE
reserved
QUALSTDBY
LPM0
Low Power Mode Entering
1. Set LPM bits
2. Enable desired exit interrupt(s)
3. Execute IDLE instruction
4. The power down sequence of the hardware depends on LP mode
* QUALSTDBY will qualify the GPIO wakeup signal in series with the GPIO port qualification. This is useful when GPIO port qualification is not available or insufficient for wake-up purposes.
RESET or XNMI
Watchdog Interrupt
yes yes no
yes no no
GPIO8
0
GPIO7
GPIO6
GPIO5
GPIO4
GPIO3
GPIO2
GPIO1
GPIO0
Wake device from HALT and STANDBY mode (GPIO Port A) 0 = disable (default) 1 = enable
Register Protection
DevEmuRegs.PROTSTART & DevEmuRegs.PROTRANGE
Write-Read Protection
- Suppose you need to write to a peripheral register and then read a different register for the same peripheral (e.g., write to control, read from status register)?
  - CPU pipeline protects W-R order for the same address
  - Write-Read protection mechanism protects W-R order for different addresses
- Configured by PROTSTART and PROTRANGE registers
  - Default values for these registers protect the address range 0x4000 to 0x7FFF
  - Default values typically sufficient
M0SARAM M1SARAM PIE Vectors 0x00 0000 0x00 0400 0x00 0800 0x00 0D00 0x00 0E00
PF 0 0x00 2000 reserved 0x00 4000 XINTF Zone 0 0x00 5000 PF 3 0x00 6000 PF 1 0x00 7000 PF 2 0x00 8000
Note: PF0 is not protected by default because the flexibility of PROTSTART and PROTRANGE are such that M0 and M1 SARAM blocks would also need to be protected, thereby reducing the performance of this RAM. See TMS320x2833x, 2823x System Control and Interrupts Reference Guide, #SPRUFB0
EALLOW Protection (1 of 2)
- EALLOW stands for Emulation Allow
- Code access to protected registers allowed only when EALLOW = 1 in the ST1 register
- The emulator can always access protected registers
- EALLOW bit controlled by assembly level instructions
  - EALLOW sets the bit (register access enabled)
  - EDIS clears the bit (register access disabled)
EALLOW Protection (2 of 2)
The following registers are protected:
- Device Emulation
- Flash
- Code Security Module
- PIE Vector Table
- DMA (most registers)
- eCANA/B (control registers only; mailbox RAM not protected)
- ePWM1 - 6 (some registers)
- GPIO (control registers only)
- System Control
See device datasheet and peripheral user's guides for detailed listings
- Set up the clock module: PLL, HISPCP = /1, LOSPCP = /4, low-power modes to default values, enable all module clocks
- Disable the watchdog: clear WD flag, disable watchdog, WD prescale = 1
- Set up watchdog system and control register: DO NOT clear WD OVERRIDE bit, WD generates a DSP reset
- Set up shared I/O pins: set all GPIO pins to GPIO function (e.g. a "00" setting for GPIO function, and a 01, 10, or 11 setting for a peripheral function)
The first part of the lab exercise will set up the system initialization and test the watchdog operation by having the watchdog cause a reset. In the second part of the lab exercise the PIE vectors will be added and tested by using the watchdog to generate an interrupt. This lab will make use of the DSP2833x C-code header files to simplify the programming of the device, as well as take care of the register definitions and addresses. Please review these files, and make use of them in the future, as needed.
Procedure
Note that include files, such as DSP2833x_Device.h and Lab.h, are automatically added at project build time. (Also, DSP2833x_DefaultIsr.h is automatically added and will be used with the interrupts in the second part of this lab exercise).
Select the Compiler tab. In the Preprocessor Category, find the Include Search Path (-i) box and enter: ..\DSP2833x_headers\include This is the path for the header files.
3. Select the Linker tab and set the Stack Size to 0x200.
4. Set up the compiler run-time support library. In the Libraries Category, find the Include Libraries (-l) box and enter: rts2800_ml.lib. Select OK and the Build Options window will close.
19. Using the PIE Interrupt Assignment Table shown in the previous module find the location for the watchdog interrupt, WAKEINT. This will be used in the next step.
    PIE group #: ____    # within group: ____
20. Modify main() to do the following:
    - Enable global interrupts (INTM bit)
    Then modify InitWatchdog() to do the following:
    - Enable the "WAKEINT" interrupt in the PIE (Hint: use the PieCtrlRegs structure)
    - Enable the appropriate core interrupt in the IER register
21. In Watchdog.c modify the system control and status register (SCSR) to cause the watchdog to generate a WAKEINT rather than a reset.
22. Open and inspect DefaultIsr_5.c. This file contains interrupt service routines. The ISR for WAKEINT has been trapped by an emulation breakpoint contained in an inline assembly statement using ESTOP0. This gives the same results as placing a breakpoint in the ISR. We will run the lab exercise as before, except this time the watchdog will generate an interrupt. If the registers have been configured properly, the code will be trapped in the ISR.
23. Open and inspect PieCtrl_5_6_7_8_9_10.c. This file is used to initialize the PIE RAM and enable the PIE. The interrupt vector table located in PieVect_5_6_7_8_9_10.c is copied to the PIE RAM to set up the vectors for the interrupts. Close the inspected files.
Analog-to-Digital Converter
Introduction
This module explains the operation of the analog-to-digital converter. The system consists of a 12-bit analog-to-digital converter with 16 analog input channels. The analog input channels have a range from 0 to 3 volts. Two input analog multiplexers are used, each supporting 8 analog input channels. Each multiplexer has its own dedicated sample and hold circuit. Therefore, sequential, as well as simultaneous sampling is supported. Also, the ADC system features programmable auto sequence conversions with 16 results registers. Start of conversion (SOC) can be performed by an external trigger, software, or an ePWM event.
Learning Objectives
- Understand the operation of the Analog-to-Digital converter (ADC)
- Use the ADC to perform data acquisition
Module Topics
Analog-to-Digital Converter
   Module Topics
   Analog-to-Digital Converter
      Analog-to-Digital Converter Registers
      Example Sequencer Start/Stop Operation
      ADC Conversion Result Buffer Register
      Signed Input Voltages
      ADC Calibration
   Lab 6: Analog-to-Digital Converter
Analog-to-Digital Converter
[Figure: ADC module block diagram, cascaded mode. Shows MUX A with S/H A, MUX B with S/H B, the result MUX with RESULT0 to RESULT15, and a single autosequencer SEQ1 with MAX_CONV1, channel selects Ch Sel (CONV00) to Ch Sel (CONV15), and a start sequence trigger.]
[Figure: ADC module block diagram, dual-sequencer mode. Shows MUX A with S/H A, MUX B with S/H B, and two autosequencers. SEQ1 has MAX_CONV1, Ch Sel (CONV00) to Ch Sel (CONV07), and results RESULT0 to RESULT7. SEQ2 has MAX_CONV2, Ch Sel (CONV08) to Ch Sel (CONV15), and results RESULT8 to RESULT15. Each sequencer has its own start sequence trigger (Software, ePWM_SOC_B).]
* Note that using Continuous Run mode with Dual Sequencer generally doesn't make sense since sequencer #2 will not get to do any conversions!
1010b (x10)
PCLKCR0.ADCENCLK = 1
FCLK = HSPCLK/(2*ADCCLKPS)
sampling window = (ACQ_PS + 1)*(1/ADCCLK)
Note: Maximum F2833x ADCCLK is 25 MHz, but INL (integral nonlinearity error) is greater above 12.5 MHz. See the device datasheet for more information.
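The clock chain from HSPCLK to the sampling window can be worked through numerically. The sketch below applies the formulas given above (FCLK = HSPCLK/(2*ADCCLKPS), ADCCLK = FCLK/(CPS+1), sampling window = (ACQ_PS+1)/ADCCLK). Treating ADCCLKPS = 0 as "no division" is an assumption made here to avoid a divide by zero, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* FCLK in Hz. ADCCLKPS is a 4-bit field; 0 is assumed to mean FCLK = HSPCLK. */
uint32_t adc_fclk_hz(uint32_t hspclk_hz, unsigned int adcclkps)
{
    assert(adcclkps <= 15u);
    if (adcclkps == 0u) {
        return hspclk_hz;
    }
    return hspclk_hz / (2u * (uint32_t)adcclkps);
}

/* ADCCLK in Hz. CPS = 0 divides FCLK by 1, CPS = 1 divides it by 2. */
uint32_t adc_adcclk_hz(uint32_t fclk_hz, unsigned int cps)
{
    assert(cps <= 1u);
    return fclk_hz / ((uint32_t)cps + 1u);
}

/* Sampling window in nanoseconds (truncated). ACQ_PS is a 4-bit field. */
uint32_t adc_sample_window_ns(uint32_t adcclk_hz, unsigned int acq_ps)
{
    assert(adcclk_hz > 0u);
    assert(acq_ps <= 15u);
    return (uint32_t)(((uint64_t)(acq_ps + 1u) * 1000000000u) / adcclk_hz);
}
```

For example, HSPCLK = 150 MHz with ADCCLKPS = 3 gives FCLK = 25 MHz; CPS = 1 then gives ADCCLK = 12.5 MHz, and ACQ_PS = 7 gives a 640 ns sampling window.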
ADC Conversion Result Buffer Register 14 ADC Conversion Result Buffer Register 15 ADC Reference Select Register ADC Offset Trim Register ADC Status and Flag Register
Upper Register:
   Bit 15      reserved
   Bit 14      RESET      ADC Module Reset: 0 = no effect, 1 = reset (set back to 0 by ADC logic)
   Bits 13-12  SUSMOD
   Bits 11-8   ACQ_PS
   Bit 7       CPS        Conversion Prescale: 0: ADCCLK = FCLK / 1, 1: ADCCLK = FCLK / 2
Lower Register:
   Bit 6       CONT_RUN   Continuous Run: 0 = stops after reaching end of sequence, 1 = continuous (starts all over again from initial state)
   Bit 5       SEQ_OVRD
   Bit 4       SEQ_CASC   Sequencer Mode: 0 = dual mode, 1 = cascaded mode
   Bits 3-0    reserved
Upper Register:
ePWM SOC B
(cascaded mode only)
13
12
11
ePWM_SOCB_SEQ | RST_SEQ1 | SOC_SEQ1 | reserved | INT_ENA_SEQ1 | INT_MOD_SEQ1 | reserved | ePWM_SOCA_SEQ1
Reset SEQ1
0 = no action 1 = immediate reset SEQ1 to initial state
Lower Register:
Start Conversion (SEQ2)
(dual-sequencer mode only)
RST_SEQ2 | SOC_SEQ2 | reserved | INT_ENA_SEQ2 | INT_MOD_SEQ2 | reserved | ePWM_SOCB_SEQ2
Reset SEQ2
0 = no action 1 = immediate reset SEQ2 to initial state
6-7
Analog-to-Digital Converter
15 - 8 reserved
7 - 6 ADCBGRFDN
5 ADCPWDN
4 - 1 ADCCLKPS
0 SMODE_SEL

Maximum conversion channels register bits:
4 MAX_CONV2_0
3 MAX_CONV1_3
2 MAX_CONV1_2
1 MAX_CONV1_1
0 MAX_CONV1_0

Each sequencer starts at the initial state and advances sequentially.
Each will wrap at the end state unless software resets it sooner.

Mode        Sequencer   Initial state   End state
Dual Mode   SEQ1        CONV00          CONV07
Dual Mode   SEQ2        CONV08          CONV15
Cascaded    SEQ         CONV00          CONV15
CONV02 CONV01 CONV00 CONV06 CONV05 CONV04 CONV10 CONV09 CONV08 CONV14 CONV13 CONV12
For purposes of these registers, channel numbers are: 0 = ADCINA0 through 7 = ADCINA7, and 8 = ADCINB0 through 15 = ADCINB7.
Bits           15-12   11-8   7-4   3-0
ADCCHSELSEQ1   I1      V3     V2    V1
ADCCHSELSEQ2   x       x      I3    I2

Once reset and initialized, SEQ1 waits for a trigger
First trigger, three conversions performed: CONV00 (V1), CONV01 (V2), CONV02 (V3)
MAX_CONV1 value is reset to 2 (unless changed by software)
SEQ1 waits for second trigger
Second trigger, three conversions performed: CONV03 (I1), CONV04 (I2), CONV05 (I3)
End of second sequence, ADC Results registers have the following values:
RESULT0   RESULT1   RESULT2   RESULT3   RESULT4   RESULT5
V1        V2        V3        I1        I2        I3
SEQ1 waits at current state for another trigger
User can reset SEQ1 by software to state CONV00 and repeat same trigger 1,2 session
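The two-trigger sequence above can be modelled in software to see how the state pointer and result registers evolve. This is only an illustrative model, not the hardware: the type `seq_model_t`, the functions, and `fake_convert` (which simply returns the channel number in place of a real sample) are all hypothetical.

```c
#include <stdint.h>

#define SEQ_STATES 16u

typedef struct {
    uint8_t  chsel[SEQ_STATES];   /* channel selected for each CONVnn state */
    uint16_t result[SEQ_STATES];  /* models RESULT0..RESULT15 */
    uint8_t  max_conv;            /* conversions per trigger, minus one */
    uint8_t  state;               /* next CONVnn state to execute */
} seq_model_t;

/* Hypothetical stand-in for a conversion: returns the channel number. */
static uint16_t fake_convert(uint8_t channel)
{
    return channel;
}

/* Software reset: return the sequencer to state CONV00. */
static void seq_reset(seq_model_t *s)
{
    s->state = 0u;
}

/* One trigger performs (max_conv + 1) conversions, then waits. */
static void seq_trigger(seq_model_t *s)
{
    for (uint8_t i = 0u; i <= s->max_conv; i++) {
        s->result[s->state] = fake_convert(s->chsel[s->state]);
        s->state = (uint8_t)((s->state + 1u) % SEQ_STATES);  /* wrap at end state */
    }
}
```

With `max_conv = 2`, the first call to `seq_trigger` fills result[0..2] and the second fills result[3..5], mirroring the V1 to V3 and I1 to I3 grouping described above.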
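[Figure: ADC result register formats. The 2 wait-state read registers hold the 12-bit result left-justified (MSB in bit 15, LSB in bit 4); the 1 wait-state read registers, AdcMirror.ADCRESULTx, x = 0 - 15, hold the result right-justified (LSB in bit 0).]
[Figure: input circuit showing Vin, a 1.5 V offset, and a resistor network driving ADCINx on the C28x, with ADCLO tied to GND.]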
ADC Calibration
Gain error
Compensated in software
Some loss in full-scale range
Requires use of a second ADC input pin and an upper-range reference voltage on that pin; see the "TMS320280x and TMS320F2801x ADC Calibration" application note (SPRAAD8) for more information
Tip: To minimize mux-to-mux variation effects, put your most critical signals on a single mux and use that mux for calibration inputs
* +/-15 LSB offset, +/-30 LSB gain. See device datasheet for exact specifications
The F28335 ADC has an internal reference with temperature stability of ~50 ppm/°C. If this is not sufficient for your application, there is the option to use an external reference.
External reference choices: 2.048 V, 1.5 V, 1.024 V. The reference value DOES NOT change the 0 - 3 V full-scale range of the ADC.
[Figure: Lab 6 setup. A connector wire feeds ADCINA0 on the ADC; ePWM2 triggers the ADC on period match using the SOC A trigger every 20.833 µs (48 kHz).]
Recall that there are three basic ways to initiate an ADC start of conversion (SOC):
1. Using software
   a. SOC_SEQ1/SOC_SEQ2 bit in ADCTRL2 causes an SOC upon completion of the current conversion (if the ADC is currently idle, an SOC occurs immediately)
2. Automatically triggered on user selectable ePWM conditions
   a. ePWM underflow (CTR = 0)
   b. ePWM period match (CTR = PRD)
   c. ePWM compare match (CTRU/D = CMPA/B)
3. Externally triggered using a pin
   a. ADCSOC pin
One or more of these methods may be applicable to a particular application. In this lab, we will be using the ADC for data acquisition. Therefore, one of the ePWMs (ePWM2) will be configured to automatically trigger the SOC A signal at the desired sampling rate (SOC method 2b above). The ADC end-of-conversion interrupt will be used to prompt the CPU to copy the results of the ADC conversion into a results buffer in memory. The buffer pointer will be managed in a circular fashion, such that new conversion results continuously overwrite older conversion results in the buffer. In order to generate an interesting input signal, the code also alternately toggles a GPIO pin (GPIO21) high and low in the ADC interrupt service routine. The ADC ISR will also toggle LED DS2 on the eZdsp as a visual indication that the ISR is running. The GPIO21 pin will be connected to the ADC input pin and sampled. After taking some data, Code Composer Studio will be used to plot the results. A flow chart of the code is shown in the following slide.
[Figure: RESULT0 is copied into a circular buffer in data memory, with pointer rewind at the end of the buffer.]
[Flow chart: ADC ISR]
  read the ADC result
  write to result buffer
  adjust the buffer pointer
  toggle the GPIO pin
  return from interrupt
Notes
Program performs conversion on ADC channel A0 (ADCINA0 pin)
ADC conversion is set at a 48 kHz sampling rate
ePWM2 is triggering the ADC on period match using SOC A trigger
Data is continuously stored in a circular buffer
GPIO21 pin is also toggled in the ADC ISR
ADC ISR will also toggle the eZdsp LED DS2 as a visual indication that it is running
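The circular-buffer handling described in the notes can be sketched in portable C. This is not the lab's DefaultIsr_6.c: the hardware accesses are replaced by a function argument and a plain variable, the buffer length of 48 is an assumption chosen to match the 48-point graph window, and `adc_index`, `gpio_level`, and `adc_isr_body` are hypothetical names.

```c
#include <stdint.h>

#define ADC_BUF_LEN 48u             /* assumed length; matches the graph window */

static uint16_t AdcBuf[ADC_BUF_LEN];
static uint16_t adc_index = 0u;     /* hypothetical buffer index */
static uint16_t gpio_level = 0u;    /* stand-in for the GPIO21 pin state */

/* Work done on each ADC interrupt, with the ADC result passed in
 * instead of being read from a hardware register. */
static void adc_isr_body(uint16_t adc_result)
{
    AdcBuf[adc_index] = adc_result;   /* write to result buffer */
    adc_index++;                      /* adjust the buffer pointer */
    if (adc_index >= ADC_BUF_LEN) {
        adc_index = 0u;               /* pointer rewind */
    }
    gpio_level ^= 1u;                 /* toggle the pin for the next sample */
}
```

Using an index with an explicit rewind keeps the ISR short and avoids a modulo operation; a real ISR would also acknowledge the interrupt before returning.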
Procedure
Project File
1. A project named Lab6.pjt has been created for this lab. Open the project by clicking on Project → Open and look in C:\C28x\Labs\Lab6. All Build Options have been configured the same as the previous lab. The files used in this lab are:
PIE group #: ______    # within group: ______
This information will be used in the next step.
5. Modify the end of Adc.c to do the following:
- Enable the "ADCINT" interrupt in the PIE (Hint: use the PieCtrlRegs structure)
- Enable the appropriate core interrupt in the IER register
6. Open and inspect DefaultIsr_6.c. This file contains the ADC interrupt service routine.
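Once the PIE group number and the position within the group are known, the two enable masks follow mechanically. The helpers below are a hypothetical sketch (they are not part of the TI header files) and assume the usual layout in which interrupt n of a group uses bit n-1 of that group's PIE enable register, and PIE group g uses bit g-1 of the core IER.

```c
#include <stdint.h>

/* Bit mask within a PIE group's enable register for interrupt 1..8. */
static uint16_t pie_ier_mask(unsigned int_in_group)
{
    return (uint16_t)(1u << (int_in_group - 1u));
}

/* Bit mask within the core IER register for PIE group 1..12. */
static uint16_t core_ier_mask(unsigned pie_group)
{
    return (uint16_t)(1u << (pie_group - 1u));
}
```

For a hypothetical interrupt at group 3, position 2, the masks would be 0x0002 for the PIE enable register and 0x0004 for IER; in the lab, the same result is normally obtained by setting the named bit field in the PieCtrlRegs structure and OR-ing the mask into IER.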
Note: Exercise care when connecting any wires, as the power to the eZdsp is on, and we do not want to damage the eZdsp! Details of pin assignments can be found in Appendix A.
11. Using a connector wire provided, connect the ADCINA0 (pin # P9-2) to GND (pin # P9-1) on the eZdsp. Then run the code again, and halt it after a few seconds. Verify that the ADC results buffer contains the expected value of 0x0000.
12. Adjust the connector wire to connect the ADCINA0 (pin # P9-2) to +3.3V (pin # P4-7) on the eZdsp. (Note: pin # P4-7 / GPIO20 has been set to 1 in Gpio.c). Then run the code again, and halt it after a few seconds. Verify that the ADC results buffer contains the expected value of 0x0FFF.
13. Adjust the connector wire to connect the ADCINA0 (pin # P9-2) to GPIO21 (pin # P4-8) on the eZdsp. Then run the code again, and halt it after a few seconds. Examine the contents of the ADC results buffer (the contents should be alternating 0x0000 and 0x0FFF values). Are the contents what you expected?
14. Open and set up a graph to plot a 48-point window of the ADC results buffer. Click: View → Graph → Time/Frequency and set the following values:
    Start Address             AdcBuf
    Acquisition Buffer Size   48
    Display Data Size         48
    DSP Data Type             16-bit unsigned integer
    Sampling Rate (Hz)        48000
    Time Display Unit         µs
    Select OK to save the graph options.
15. Recall that the code toggled the GPIO21 pin alternately high and low. (Also, the ADC ISR is toggling the LED DS2 on the eZdsp as a visual indication that the ISR is running.) If you had an oscilloscope available to display GPIO21, you would expect to see a square wave. Why does the Code Composer Studio plot resemble a triangle wave? What is the signal processing term for what is happening here?
16. Recall that the program toggled the GPIO21 pin at a 48 kHz rate. Therefore, a complete cycle (toggle high, then toggle low) occurs at half this rate, or 24 kHz. We therefore expect the period of the waveform to be 41.667 µs. Confirm this by measuring the period of the triangle wave using the graph (you may want to enlarge the graph window using the mouse). The measurement is best done with the mouse. The lower left-hand corner of the graph window will display the X and Y axis values. Subtract the X-axis values taken over a complete waveform period.
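The expected values in steps 11 through 16 can be cross-checked with a small host-side calculation. This is a sketch under stated assumptions: an ideal 12-bit converter over the 0 - 3 V full-scale range with ADCLO at 0 V, inputs expressed in millivolts, and hypothetical function names.

```c
#include <stdint.h>

/* Ideal 12-bit code for an input in millivolts, assuming a 0-3 V
 * full-scale range and ADCLO = 0 V. Inputs outside the range saturate. */
static uint16_t adc_ideal_code(int32_t vin_mv)
{
    if (vin_mv <= 0) {
        return 0x0000u;
    }
    if (vin_mv >= 3000) {
        return 0x0FFFu;
    }
    return (uint16_t)((4096 * vin_mv) / 3000);
}

/* Period in ns of a pin toggled once per sample: two samples per cycle.
 * The result is truncated to a whole number of nanoseconds. */
static uint32_t toggle_period_ns(uint32_t sample_rate_hz)
{
    return (uint32_t)(2000000000ull / sample_rate_hz);
}
```

Under these assumptions GND reads 0x0000, the 3.3 V pin saturates at 0x0FFF, and a pin toggled at 48 kHz has a period of about 41.667 µs, consistent with the values quoted in the steps.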
18. A message box may appear. Select YES to enable debug events. This will set bit 1 (DBGM bit) of status register 1 (ST1) to a 0. The DBGM is the debug enable mask bit. When the DBGM bit is set to 0, memory and register values can be passed to the host processor for updating the debugger windows.
19. The memory and graph windows displaying AdcBuf should still be open. The connector wire between ADCINA0 (pin # P9-2) and GPIO21 (pin # P4-8) should still be connected. In real-time mode, we would like to have our window continuously refresh. Click: View → Real-time Refresh Options and check Global Continuous Refresh. Use the default refresh rate of 100 ms and select OK. Alternately, we could have right clicked on each window individually and selected Continuous Refresh.
Note: Global Continuous Refresh causes all open windows to refresh at the refresh rate. This can be problematic when a large number of windows are open, as bandwidth over the emulation link is limited. Updating too many windows can cause the refresh frequency to bog down. In that case, either close some windows, or disable global refresh and selectively enable Continuous Refresh for individual windows of interest instead.
20. Run the code and watch the windows update in real-time mode. Are the values updating as expected?
21. Fully halting the DSP when in real-time mode is a two-step process. First, halt the processor with Debug → Halt. Then uncheck the Real-time mode to take the DSP out of real-time mode (Debug → Real-time Mode).
22. So far, we have seen data flowing from the DSP to the debugger in real-time. In this step, we will flow data from the debugger to the DSP. Open and inspect DefaultIsr_6.c. Notice that the global variable DEBUG_TOGGLE is used to control the toggling of the GPIO21 pin. This is the pin being read with the ADC. Highlight DEBUG_TOGGLE with the mouse, right click and select Add to Watch Window. The global variable DEBUG_TOGGLE should now be in the watch window with a value of 1. Run the code in real-time mode and change the value to 0. Are the results shown in the memory and graph window as expected? Change the value back to 1. As you can see, we are modifying data memory contents while the processor is running in real-time (i.e., we are not halting the DSP nor interfering with its operation in any way)! When done, fully halt the DSP.
23. Code Composer Studio includes GEL (General Extension Language) functions which automate entering and exiting real-time mode. Four functions are available:
    Run_Realtime_with_Reset (reset DSP, enter real-time mode, run DSP)
    Run_Realtime_with_Restart (restart DSP, enter real-time mode, run DSP)
    Full_Halt (exit real-time mode, halt DSP)
    Full_Halt_with_Reset (exit real-time mode, halt DSP, reset DSP)
These GEL functions can be executed by clicking: GEL → Realtime Emulation Control → GEL Function
In the remaining lab exercises we will be using the above GEL functions to run and halt the code in real-time mode. If you would like, try repeating the previous step using the following GEL functions:
    GEL → Realtime Emulation Control → Run_Realtime_with_Reset
    GEL → Realtime Emulation Control → Full_Halt
End of Exercise
Control Peripherals
Introduction
This module explains how to generate PWM waveforms using the ePWM unit. The eCAP and eQEP units are also discussed.
Learning Objectives
Pulse Width Modulation (PWM) review
Generate a PWM waveform with the Pulse Width Modulator Module (ePWM)
Use the Capture Module (eCAP) to measure the width of a waveform
Explain the function of Quadrature Encoder Pulse Module (eQEP)
Note: Different numbers of ePWM, eCAP, and eQEP modules are available on F2833x and F2823x devices. See the device datasheet for more information.
Module Topics
Control Peripherals
  Module Topics
  PWM Review
  ePWM
    ePWM Time-Base Module
    ePWM Compare Module
    ePWM Action Qualifier Module
    Asymmetric and Symmetric Waveform Generation using the ePWM
    PWM Computation Example
    ePWM Dead-Band Module
    ePWM PWM Chopper Module
    ePWM Trip-Zone Module
    ePWM Event-Trigger Module
    Hi-Resolution PWM (HRPWM)
  eCAP
  eQEP
  Lab 7: Control Peripherals
PWM Review
[Figure: original signal and its PWM representation.]
Pulse width modulation (PWM) is a method for representing an analog signal with a digital approximation. The PWM signal consists of a sequence of variable width, constant amplitude pulses which contain the same total energy as the original analog signal. This property is valuable in digital motor control as sinusoidal current (energy) can be delivered to the motor using PWM signals applied to the power converter. Although energy is input to the motor in discrete packets, the mechanical inertia of the rotor acts as a smoothing filter. Dynamic motor motion is therefore similar to having applied the sinusoidal currents directly.
[Figure: desired signal to system.]
ePWM
[Figure: ePWM module block diagram. SYSCLKOUT feeds the Clock Prescaler, which produces TBCLK for the time-base counter (TBCTR . 15 - 0) with its shadowed Period Register (TBPRD . 15 - 0) and EPWMxSYNCI input. The signal path is Compare Logic (Compare Registers), Action Qualifier (AQCTLA . 11 - 0, AQCTLB . 11 - 0), Dead Band (DBCTL . 4 - 0), PWM Chopper (PCCTL . 10 - 0), and Trip Zone (TZSEL . 15 - 0, TZy inputs), ending at the EPWMxA and EPWMxB outputs.]
[Figure: time-base synchronization. SyncOut is selected from the sync input, CTR = zero, CTR = CMPB, or disabled (X).]
Time-base registers (by description):
  Time-Base Control
  Time-Base Status
  Time-Base Phase
  Time-Base Counter
  Time-Base Period

Upper Register:
15 - 14 FREE_SOFT: Emulation Halt Behavior
  00 = stop after next CTR inc/dec
  01 = stop when: Up Mode, CTR = PRD; Down Mode, CTR = 0; Up/Down Mode, CTR = 0
  1x = free run (do not stop)
13 PHSDIR: Phase Direction
  0 = count down after sync
  1 = count up after sync
TB Clock Prescale
  000 = /1 (default)
  001 = /2
  010 = /4
  011 = /8
  100 = /16
  101 = /32
  110 = /64
  111 = /128
High Speed TB Clock Prescale
  000 = /1
  001 = /2 (default)
  010 = /4
  011 = /6
  100 = /8
  101 = /10
  110 = /12
  111 = /14
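The two prescale tables above combine into a single divider between SYSCLKOUT and TBCLK. The sketch below assumes TBCLK = SYSCLKOUT / (high-speed prescale × TB clock prescale) and takes the raw 3-bit field codes as inputs; the function name and parameter names are hypothetical.

```c
#include <stdint.h>

/* TBCLK in Hz from the two 3-bit prescale codes.
 * hsp_code : High Speed TB Clock Prescale code (0 -> /1, n -> /(2n))
 * tb_code  : TB Clock Prescale code (n -> /(2^n))
 * Assumption: the two dividers are applied in series. */
static uint32_t tbclk_hz(uint32_t sysclkout_hz, uint32_t hsp_code, uint32_t tb_code)
{
    uint32_t hsp_div = (hsp_code == 0u) ? 1u : 2u * hsp_code;
    uint32_t tb_div  = 1u << tb_code;
    return sysclkout_hz / (hsp_div * tb_div);
}
```

With both fields at their defaults (high-speed code 001, TB code 000) and a 150 MHz SYSCLKOUT, this yields 75 MHz; both codes must be set to 000 to run the time base at the full 150 MHz.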
Lower Register:
6 SWFSYNC: Software Force Sync Pulse
  0 = no action
  1 = force one-time sync
5 - 4 SYNCOSEL
3 PRDLD
2 PHSEN
1 - 0 CTRMODE: Counter Mode
  00 = count up
  01 = count down
  10 = count up and down
  11 = stop freeze (default)

Time-Base Status:
15 - 3 reserved
2 CTRMAX: Counter Max Latched
  0 = max value not reached
  1 = CTR = 0xFFFF (write 1 to clear)
1 SYNCI: External Input Sync Latched
  0 = no sync event occurred
  1 = sync has occurred (write 1 to clear)
0 CTRDIR
[Figure: ePWM module block diagram (repeated).]
[Figure: counter modes and resulting waveforms. Count Up Mode: asymmetrical waveform. Count Down Mode: asymmetrical waveform. Count Up and Down Mode: symmetrical waveform.]
6 SHDWBMODE
5 reserved
4 SHDWAMODE
3 - 2 LOADBMODE
1 - 0 LOADAMODE
CMPA and CMPB Operating Mode (SHDWAMODE / SHDWBMODE)
  0 = shadow mode; double buffer w/ shadow register
  1 = immediate mode; shadow register not used
CMPA and CMPB Shadow Load Mode (LOADAMODE / LOADBMODE)
  00 = load on CTR = 0
  01 = load on CTR = PRD
  10 = load on CTR = 0 or PRD
  11 = freeze (no load possible)
[Figure: ePWM module block diagram (repeated).]
[Figure: action qualifier actions. For each event (S/W force, Zero, CMPA, CMPB, Period) the output can do nothing (X), clear low, set high, or toggle (T).]
[Figures: example EPWMA and EPWMB waveforms produced by action-qualifier events Z, P, CA, and CB, including toggle (T) actions.]
Action-qualifier registers (by description):
  AQ Control Output A
  AQ Control Output B
  AQ S/W Force
  AQ Cont. S/W Force

15 - 12 reserved
11 - 10 CBD
9 - 8 CBU
7 - 6 CAD
5 - 4 CAU
3 - 2 PRD
1 - 0 ZRO
Action for each field:
  00 = do nothing (action disabled)
  01 = clear (low)
  10 = set (high)
  11 = toggle (low → high; high → low)
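The field layout above lends itself to a small packing helper. The following is a sketch only: `aq_action` and `aqctl_pack` are hypothetical names, and the result is a plain 16-bit value that would then be written to the action-qualifier control register for output A or B.

```c
#include <stdint.h>

/* Two-bit action codes from the table above. */
enum aq_action { AQ_NONE = 0, AQ_CLEAR = 1, AQ_SET = 2, AQ_TOGGLE = 3 };

/* Pack the six 2-bit action fields into one control word. */
static uint16_t aqctl_pack(enum aq_action zro, enum aq_action prd,
                           enum aq_action cau, enum aq_action cad,
                           enum aq_action cbu, enum aq_action cbd)
{
    return (uint16_t)(((unsigned)cbd << 10) | ((unsigned)cbu << 8) |
                      ((unsigned)cad << 6)  | ((unsigned)cau << 4) |
                      ((unsigned)prd << 2)  |  (unsigned)zro);
}
```

For example, "set on zero, clear on CMPA up-count" packs to 0x0012, which is one way to produce an asymmetric waveform in count-up mode.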
15 - 8 reserved
7 - 6 RLDCSF: AQSFRC Shadow Reload Options
  00 = load on event CTR = 0
  01 = load on event CTR = PRD
  10 = load on event CTR = 0 or CTR = PRD
  11 = load immediately (from active reg.)
5 OTSFB
4 - 3 ACTSFB
2 OTSFA
1 - 0 ACTSFA
Action on One-Time S/W Force B / A (ACTSFB / ACTSFA)
  00 = do nothing (action disabled)
  01 = clear (low)
  10 = set (high)
  11 = toggle (low → high; high → low)

15 - 4 reserved
3 - 2 CSFB
1 - 0 CSFA
Continuous S/W Force on Output B / A (CSFB / CSFA)
  00 = forcing disabled
  01 = force continuous low on output
  10 = force continuous high on output
  11 = forcing disabled
Asymmetric and Symmetric Waveform Generation using the ePWM
PWM switching frequency:
The PWM carrier frequency is determined by the value contained in the time-base period register, and the frequency of the clocking signal. The value needed in the period register is:
Asymmetric PWM:
\[ \text{period register} = \frac{\text{switching period}}{\text{timer period}} - 1 \]
Symmetric PWM:
\[ \text{period register} = \frac{\text{switching period}}{2\,(\text{timer period})} \]
Notice that in the symmetric case, the period value is half that of the asymmetric case. This is because for up/down counting, the actual timer period is twice that specified in the period register (i.e. the timer counts up to the period register value, and then counts back down).
PWM resolution:
The PWM compare function resolution can be computed once the period register value is determined. The largest power of 2 is determined that is less than (or close to) the period value. As an example, if asymmetric was 1000, and symmetric was 500, then:
Asymmetric PWM: approx. 10 bit resolution since $2^{10} = 1024 \approx 1000$
Symmetric PWM: approx. 9 bit resolution since $2^{9} = 512 \approx 500$
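The period and resolution formulas above translate directly into integer arithmetic. This is a minimal sketch with hypothetical function names; it expresses the switching period and timer period as frequencies, and it approximates the resolution as the smallest power of 2 that reaches the period count, which matches the "close to" reading used in the text.

```c
#include <stdint.h>

/* Period register value for asymmetric (count-up or count-down) PWM. */
static uint32_t tbprd_asymmetric(uint32_t f_tbclk_hz, uint32_t f_pwm_hz)
{
    return f_tbclk_hz / f_pwm_hz - 1u;
}

/* Period register value for symmetric (count up-and-down) PWM. */
static uint32_t tbprd_symmetric(uint32_t f_tbclk_hz, uint32_t f_pwm_hz)
{
    return f_tbclk_hz / (2u * f_pwm_hz);
}

/* Approximate resolution: smallest n such that 2^n >= period counts. */
static unsigned approx_resolution_bits(uint32_t period_counts)
{
    unsigned bits = 0u;
    while ((1ul << bits) < period_counts) {
        bits++;
    }
    return bits;
}
```

With a 150 MHz TBCLK and a hypothetical 150 kHz carrier, the asymmetric period register is 999 and the symmetric one is 500, giving roughly 10 and 9 bits of resolution as in the example above.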
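[Figure: PWM computation example. Two cases are shown, each with fTBCLK = 150 MHz, a compare match (CA) setting the PWM pin edge, and TBPRD to be solved for; the second case counts up to the period (P) and back down.]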
[Figure: ePWM module block diagram (repeated).]
Transistor gates turn on faster than they shut off.
Short circuit if both gates are on at the same time!
Dead-band control provides a convenient means of combating current shoot-through problems in a power converter. Shoot-through occurs when both the upper and lower gates in the same phase of a power converter are open simultaneously. This condition shorts the power supply and results in a large current draw. Shoot-through problems occur because transistors open faster than they close, and because high-side and low-side power converter gates are typically switched in a complementary fashion. Although the duration of the shoot-through current path is finite during PWM cycling (i.e. the closing gate will eventually shut), even brief periods of a short circuit condition can produce excessive heating and overstress in the power converter and power supply.
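[Figure: dead-band module. Switches S4 and S5 (IN-MODE) select PWMxA or PWMxB as the source for the rising-edge delay (RED) and falling-edge delay (FED) 10-bit counters; S2 and S3 (POLSEL) select polarity; S1 and S0 (OUT-MODE) select the delayed or bypassed signals for the PWMxA and PWMxB outputs.]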
Two basic approaches exist for controlling shoot-through: modify the transistors, or modify the PWM gate signals controlling the transistors. In the first case, the opening time of the transistor gate must be increased so that it (slightly) exceeds the closing time. One way to accomplish this is by adding a cluster of passive components such as resistors and diodes in series with the transistor gate, as shown in the next figure.
[Figure: shoot-through control via power circuit modification.]
The resistor acts to limit the current rise rate towards the gate during transistor opening, thus increasing the opening time. When closing the transistor, however, current flows unimpeded from the gate via the by-pass diode, and closing time is therefore not affected. While this passive approach offers an inexpensive solution that is independent of the control microprocessor, it is imprecise, the component parameters must be individually tailored to the power converter, and it cannot adapt to changing system conditions.
The second approach to shoot-through control separates transitions on complementary PWM signals with a fixed period of time. This is called dead-band. While it is possible to implement dead-band in software, the C28x offers on-chip hardware for this purpose that requires no additional CPU overhead. Compared to the passive approach, dead-band offers more precise control of gate timing requirements. In addition, the dead time is typically specified with a single program variable that is easily changed for different power converters or adapted on-line.
Dead-band registers (by description):
  Dead-Band Control
  10-bit Rising Edge Delay
  10-bit Falling Edge Delay

Rising Edge Delay = T_TBCLK × DBRED
Falling Edge Delay = T_TBCLK × DBFED
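Inverting the delay formulas gives the register count for a desired dead time. The helper below is a hypothetical sketch: it rounds to the nearest TBCLK count and saturates at the 10-bit maximum, both of which are design choices rather than anything mandated by the hardware.

```c
#include <stdint.h>

#define DB_MAX_COUNT 1023u   /* largest value a 10-bit delay register can hold */

/* Counts to load into DBRED or DBFED for a desired delay in nanoseconds. */
static uint16_t deadband_counts(uint32_t f_tbclk_hz, uint32_t delay_ns)
{
    /* counts = delay / T_TBCLK = delay_ns * f_tbclk / 1e9, rounded to nearest */
    uint64_t counts = ((uint64_t)f_tbclk_hz * delay_ns + 500000000ull)
                      / 1000000000ull;
    return (uint16_t)(counts > DB_MAX_COUNT ? DB_MAX_COUNT : counts);
}
```

At a 150 MHz TBCLK, a 1 µs dead time corresponds to 150 counts, and the longest delay the 10-bit counter can express is about 6.8 µs; longer requests saturate in this sketch, so a caller may prefer to treat that as an error.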
In-Mode Control
  00 = PWMxA is source for RED and FED
  01 = PWMxA is source for FED; PWMxB is source for RED
  10 = PWMxA is source for RED; PWMxB is source for FED
  11 = PWMxB is source for RED and FED
Out-Mode Control
  00 = disabled (DBM bypass)
  01 = PWMxA = no delay; PWMxB = FED
  10 = PWMxA = RED; PWMxB = no delay
  11 = RED & FED (DBM fully enabled)
[Figure: ePWM module block diagram (repeated).]
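[Figure: PWM chopper waveforms. The EPWMxA and EPWMxB signals are chopped by the chopper clock; the first pulse (OSHT) has a programmable pulse width (OSHTWTH) and is followed by sustaining pulses.]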
Name: PCCTL
Structure: EPwmxRegs.PCCTL.all =

15 - 11 reserved
10 - 8 CHPDUTY: Chopper Clk Duty Cycle
  000 = 1/8 (12.5%)
  001 = 2/8 (25.0%)
  010 = 3/8 (37.5%)
  011 = 4/8 (50.0%)
  100 = 5/8 (62.5%)
  101 = 6/8 (75.0%)
  110 = 7/8 (87.5%)
  111 = reserved
4 - 1 OSHTWTH: One-Shot Pulse Width
  0000 = 1 x SYSCLKOUT/8     1000 = 9 x SYSCLKOUT/8
  0001 = 2 x SYSCLKOUT/8     1001 = 10 x SYSCLKOUT/8
  0010 = 3 x SYSCLKOUT/8     1010 = 11 x SYSCLKOUT/8
  0011 = 4 x SYSCLKOUT/8     1011 = 12 x SYSCLKOUT/8
  0100 = 5 x SYSCLKOUT/8     1100 = 13 x SYSCLKOUT/8
  0101 = 6 x SYSCLKOUT/8     1101 = 14 x SYSCLKOUT/8
  0110 = 7 x SYSCLKOUT/8     1110 = 15 x SYSCLKOUT/8
  0111 = 8 x SYSCLKOUT/8     1111 = 16 x SYSCLKOUT/8
0 CHPEN
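The one-shot width table can be turned into a time value. The sketch below assumes that "n x SYSCLKOUT/8" means n periods of a clock running at SYSCLKOUT divided by 8, i.e. n × 8 SYSCLKOUT cycles; that interpretation and the function name are assumptions to verify against the ePWM reference guide.

```c
#include <stdint.h>

/* First-pulse width in picoseconds for a 4-bit OSHTWTH code (0..15).
 * Assumption: width = (code + 1) * 8 SYSCLKOUT cycles. */
static uint32_t chopper_oneshot_width_ps(uint32_t sysclkout_hz, uint32_t oshtwth)
{
    uint64_t n = (uint64_t)(oshtwth & 0xFu) + 1u;
    return (uint32_t)((n * 8ull * 1000000000000ull) / sysclkout_hz);
}
```

Under this assumption and a 150 MHz SYSCLKOUT, the width ranges from about 53 ns (code 0000) to about 853 ns (code 1111).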
[Figure: ePWM module block diagram (repeated).]
[Figure: trip-zone module. Trip events generate EPWMxTZINT to the DSP core and act on the PWM outputs EPWM1A/B through EPWM6A/B.]
The power drive protection is a safety feature that is provided for the safe operation of systems such as power converters and motor drives. It can be used to inform the monitoring program of motor drive abnormalities such as over-voltage, over-current, and excessive temperature rise. If the power drive protection interrupt is unmasked, the PWM output pins will be put in the high-impedance state immediately after the power drive protection pin is driven low. An interrupt will also be generated.
Trip-zone registers (by description):
  Trip-Zone Control
  Trip-Zone Select
  Enable Interrupt
  Trip-Zone Flag
  Trip-Zone Clear
  Trip-Zone Force

15 - 4 reserved
3 - 2 TZB
1 - 0 TZA
TZ1 to TZ6 Action on EPWMxB / EPWMxA (TZB / TZA)
  00 = high impedance
  01 = force high
  10 = force low
  11 = do nothing (disable)

15 - 3 reserved
2 OST
1 CBC
0 reserved
[Figure: ePWM module block diagram (repeated).]
[Figure: event-trigger events marked on the EPWMA and EPWMB waveforms: CTR = 0, CTR = PRD, CTRU = CMPA, CTRD = CMPA, CTRU = CMPB, CTRD = CMPB.]
Event-trigger registers (by description):
  Event-Trigger Selection
  Event-Trigger Pre-Scale
  Event-Trigger Flag
  Event-Trigger Clear
  Event-Trigger Force

15 SOCBEN
14 - 12 SOCBSEL
11 SOCAEN
10 - 8 SOCASEL
7 - 4 reserved
3 INTEN
2 - 0 INTSEL
EPWMxSOCB / A Select (SOCBSEL / SOCASEL) and EPWMxINT Select (INTSEL)
  000 = reserved
  001 = CTR = 0
  010 = CTR = PRD
  011 = reserved
  100 = CTRU = CMPA
  101 = CTRD = CMPA
  110 = CTRU = CMPB
  111 = CTRD = CMPB
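As an illustration of the bit layout above, the following sketch computes an event-trigger selection value that enables SOC A on a chosen event, such as the period match used to trigger the ADC in the labs. The enum and function are hypothetical; real code would normally set the named bit fields through the device header structures instead of masking a raw word.

```c
#include <stdint.h>

/* Event codes from the select table above. */
enum et_event {
    ET_CTR_ZERO  = 1,
    ET_CTR_PRD   = 2,
    ET_CTRU_CMPA = 4,
    ET_CTRD_CMPA = 5,
    ET_CTRU_CMPB = 6,
    ET_CTRD_CMPB = 7
};

/* Return etsel with SOCASEL (bits 10-8) set to ev and SOCAEN (bit 11) set. */
static uint16_t etsel_enable_soca(uint16_t etsel, enum et_event ev)
{
    etsel &= (uint16_t)~(0x7u << 8);          /* clear SOCASEL */
    etsel |= (uint16_t)((unsigned)ev << 8);   /* select the event */
    etsel |= (uint16_t)(1u << 11);            /* SOCAEN = 1 */
    return etsel;
}
```

Starting from zero, selecting the period-match event yields 0x0A00; other fields in the word are left untouched.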
Fields shown: 15 - 14 SOCBCNT, SOCAPRD, INTCNT, INTPRD

EPWMxINT Counter (number of events that have occurred)

EPWMxSOCB / A Period (number of events before SOC)
  00 = disabled
  01 = SOC on first event
  10 = SOC on second event
  11 = SOC on third event

EPWMxINT Period (number of events before INT)
  00 = disabled
  01 = INT on first event
  10 = INT on second event
  11 = INT on third event
HRPWM divides a clock cycle into smaller steps called Micro Steps (Step Size ~= 150 ps)
[Figure: a PWM clock cycle subdivided into micro steps (ms), with Calibration Logic.]
Calibration Logic tracks the number of Micro Steps per clock to account for variations caused by Temp/Volt/Process
Significantly increases the resolution of conventionally derived digital PWM
Uses 8-bit extensions to Compare registers (CMPxHR) and Phase register (TBPHSHR) for edge positioning control
Typically used when PWM resolution falls below ~9-10 bits, which occurs at frequencies greater than ~300 kHz (with system clock of 150 MHz)
Not all ePWM outputs support HRPWM feature (see device datasheet)
eCAP
The capture units allow time-based logging of external TTL signal transitions on the capture input pins. The C28x has up to six capture units. Capture units can be configured to trigger an A/D conversion that is synchronized with an external event. There are several potential advantages to using the capture for this function over the ADCSOC pin associated with the ADC module. First, the ADCSOC pin is level triggered, and therefore only low to high external signal transitions can start a conversion. The capture unit does not suffer from this limitation since it is edge triggered and can be configured to start a conversion on either rising edges or falling edges. Second, if the ADCSOC pin is held high longer than one conversion period, a second conversion will be immediately initiated upon completion of the first. This unwanted second conversion could still be in progress when a desired conversion is needed. In addition, if the end-of-conversion ADC interrupt is enabled, this second conversion will trigger an unwanted interrupt upon its completion. These two problems are not a concern with the capture unit. Finally, the capture unit can send an interrupt request to the CPU while it simultaneously initiates the A/D conversion. This can yield a time savings when computations are driven by an external event since the interrupt allows preliminary calculations to begin at the start-of-conversion, rather than at the end-of-conversion using the ADC end-of-conversion interrupt. The ADCSOC pin does not offer a start-of-conversion interrupt. Rather, polling of the ADCSOC bit in the control register would need to be performed to trap the externally initiated start of conversion.
[Figure: velocity estimation from capture time stamps, using the position change divided by tk - tk-1.]
[Figure: eCAP block diagram, capture mode. The ECAPx pin passes through the Event Prescale (PRESCALE, ECCTL . 13 - 9) and Polarity Select 1 to 4 stages into the Event Logic; the time-stamp counter (TSCTR . 31 - 0), clocked from SYSCLKOUT, is latched into Capture 1 to Capture 4 Registers (CAP1 to CAP4 . 31 - 0).]
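[Figure: eCAP block diagram, APWM mode. TSCTR . 31 - 0 is compared against period and compare values that can be written in immediate mode or loaded in shadow mode from CAP3 . 31 - 0 and CAP4 . 31 - 0; the output drives the ECAP pin.]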
Upper Register:
15 - 14 FREE_SOFT: Emulation Control
  00 = TSCTR stops immediately
  01 = TSCTR runs until equals 0
  1X = free run (do not stop)
13 - 9 PRESCALE: Event Filter Prescale Counter
  00000 = divide by 1 (bypass)
  00001 = divide by 2
  00010 = divide by 4
  00011 = divide by 6
  00100 = divide by 8
  ...
  11110 = divide by 60
  11111 = divide by 62
8 CAPLDEN: CAP1 - 4 Load on Capture Event
  0 = disable
  1 = enable
Lower Register:
Counter Reset on Capture Event
  0 = no reset (absolute time stamp mode)
  1 = reset after capture (difference mode)
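The prescale table follows a simple rule: code 0 bypasses the prescaler and any other code n divides by 2n. A hypothetical helper that captures the rule:

```c
/* Divisor selected by the 5-bit event prescale code (0..31). */
static unsigned ecap_prescale_divisor(unsigned prescale_code)
{
    prescale_code &= 0x1Fu;                       /* keep the 5-bit field */
    return (prescale_code == 0u) ? 1u : 2u * prescale_code;
}
```

This reproduces the listed endpoints: code 00100 divides by 8 and code 11111 divides by 62.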
Upper Register:
15 - 11 reserved
10 APWMPOL
9 CAP_APWM: Capture / APWM mode
  0 = capture mode
  1 = APWM mode
8 SWSYNC: Software Force Counter Synchronization
  0 = no effect
  1 = TSCTR load of current module and other modules if SYNCO_SEL bits = 00
Lower Register:
7 - 6 SYNCO_SEL: Sync-Out Select
  00 = sync-in to sync-out
  01 = CTR = PRD event generates sync-out
  1X = disable
5 SYNCI_EN: Counter Sync-In
  0 = disable
  1 = enable
TSCTRSTOP
Re-arm (capture mode only)
STOP_WRAP: Stop Value for One-Shot Mode / Wrap Value for Continuous Mode (capture mode only)
CONT_ONESHT: Continuous/One-Shot (capture mode only)
The capture unit interrupts offer immediate CPU notification of externally captured events. In situations where this is not required, the interrupts can be masked and flag testing/polling can be used instead. This offers increased flexibility for resource management. For example, consider a servo application where a capture unit is being used for low-speed velocity estimation via a pulsing sensor. The velocity estimate is not used until the next control law calculation is made, which is driven in real-time using a timer interrupt. Upon entering the timer interrupt service routine, software can test the capture interrupt flag bit. If sufficient servo motion has occurred since the last control law calculation, the capture interrupt flag will be set and software can proceed to compute a new velocity estimate. If the flag is not set, then sufficient motion has not occurred and some alternate action would be taken for updating the velocity estimate. As a second example, consider the case where two successive captures are needed before a computation proceeds (e.g. measuring the width of a pulse). If the width of the pulse is needed as soon as the pulse ends, then the capture interrupt is the best option. However, the capture interrupt will occur after each of the two captures, the first of which will waste a small number of cycles while the CPU is interrupted and then determines that it is indeed only the first capture. If the width of the pulse is not needed as soon as the pulse ends, the CPU can check, as needed, the capture registers to see if two captures have occurred, and proceed from there.
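For the pulse-width case described above, the width is the difference between two successive time stamps. The sketch below assumes absolute time-stamp mode with a free-running 32-bit counter clocked at SYSCLKOUT; the function names are hypothetical, and the capture values are passed in as arguments, not read from the capture registers.

```c
#include <stdint.h>

/* Elapsed counts between two captures of a free-running 32-bit counter.
 * Unsigned subtraction gives the right answer even if the counter
 * wrapped once between the two captures. */
static uint32_t capture_delta(uint32_t first, uint32_t second)
{
    return second - first;
}

/* Convert a count to nanoseconds for a given counter clock.
 * The result is truncated to 32 bits, so very long intervals overflow. */
static uint32_t counts_to_ns(uint32_t counts, uint32_t sysclkout_hz)
{
    return (uint32_t)(((uint64_t)counts * 1000000000ull) / sysclkout_hz);
}
```

In the polled approach, software would call these only after checking that both captures have occurred; in the interrupt approach, the same arithmetic would run in the ISR after the second capture.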
eQEP
[Figure: incremental optical encoder. Shaft rotation produces the quadrature outputs Ch. A and Ch. B from the photo sensors.]
The eQEP circuit, when enabled, decodes and counts the quadrature-encoded input pulses. It can be used to interface with an optical encoder to get position and speed information from a rotating machine.
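[Figure: quadrature decoder state machine. Transitions between the (Ch. A, Ch. B) states (00, 10, 11, ...) in one direction increment the counter; transitions in the opposite direction decrement it.]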
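The decode-and-count behavior can be imitated in software with a small lookup table. The sketch below is only an illustration of the principle, not a model of the eQEP hardware: the names QepSketch, qep_step, and qep_sketch_update() are hypothetical, and the choice that the sequence 00, 10, 11, 01 counts up is an assumption.

```c
#include <stdint.h>

/* Step for each (previous state, current state) pair.
   A state is the 2-bit value (A << 1) | B; index = (previous << 2) | current.
   Assumed convention: the sequence 00 -> 10 -> 11 -> 01 -> 00 counts up. */
static const int8_t qep_step[16] = {
    /* prev 00 */  0, -1, +1,  0,
    /* prev 01 */ +1,  0,  0, -1,
    /* prev 10 */ -1,  0,  0, +1,
    /* prev 11 */  0, +1, -1,  0
};

typedef struct {
    uint8_t prev;    /* last sampled (A,B) state */
    int32_t count;   /* position count           */
} QepSketch;

/* Sample both channels and update the position count.
   Invalid transitions (both channels changing at once) are ignored. */
void qep_sketch_update(QepSketch *q, uint8_t a, uint8_t b)
{
    uint8_t curr = (uint8_t)(((a & 1u) << 1) | (b & 1u));
    q->count += qep_step[(q->prev << 2) | curr];
    q->prev = curr;
}
```

Because every edge on either channel is counted, one full cycle of Ch. A produces four counts.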
[Figure: eQEP block diagram]
Signals: EQEPxA/XCLK, EQEPxB/XDIR, EQEPxI, EQEPxS, SYSCLKOUT
Quadrature clock mode / Direction count mode
Quadrature Capture
QEP Watchdog: Monitors the quadrature clock to indicate proper operation of the motion control system
Quadrature Decoder: Generates the direction and clock for the position counter in quadrature count mode
Position/Counter Compare: Generates a sync output and/or interrupt on a position compare match
eQEP Connections
[Figure: encoder signals Ch. A, Ch. B, Index, and Strobe (from homing sensor) connected to the eQEP module, including the Quadrature Capture block.]
[Figure: lab block diagram showing the ADC (input ADCINA0, RESULT0), eCAP1 (Capture 1 to Capture 4 registers), and ePWM2. ePWM2 triggers the ADC on period match using SOC A every 20.833 µs (48 kHz).]
Procedure
Project File
1. A project named Lab7.pjt has been created for this lab. Open the project by clicking on Project → Open and look in C:\C28x\Labs\Lab7. All Build Options have been configured the same as the previous lab. The files used in this lab are:
Adc_6_7_8.c                       Gpio.c
CodeStartBranch.asm               Lab_5_6_7.cmd
DefaultIsr_7.c                    Main_7.c
DelayUs.asm                       PieCtrl_5_6_7_8_9_10.c
DSP2833x_GlobalVariableDefs.c     PieVect_5_6_7_8_9_10.c
DSP2833x_Headers_nonBIOS.cmd      SysCtrl.c
ECap_7_8_9_10_12.c                Watchdog.c
EPwm_7_8_9_10_12.c
...
8. Open and set up a graph to plot a 48-point window of the ADC results buffer. Click: View → Graph → Time/Frequency and set the following values:
Start Address               AdcBuf
Acquisition Buffer Size     48
Display Data Size           48
DSP Data Type               16-bit unsigned integer
Sampling Rate (Hz)          48000
Time Display Unit           µs
Select OK to save the graph options.
9. The graphical display should show the generated 2 kHz, 25% duty cycle symmetric PWM waveform. The period of a 2 kHz signal is 500 µs. You can confirm this by measuring the period of the waveform using the graph (you may want to enlarge the graph window using the mouse). The measurement is best done with the mouse. The lower left-hand corner of the graph window will display the X and Y-axis values. Subtract the X-axis values taken over a complete waveform period (you can use the PC calculator program found in Microsoft Windows to do this).
Display Type Start Address Acquisition Buffer Size FFT Framesize DSP Data Type Sampling Rate (Hz) Select OK to save the graph options.
11. On the plot window, left-click the mouse to move the vertical marker line and observe the frequencies of the different magnitude peaks. Do the peaks occur at the expected frequencies?
12. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.
Check your files list to make sure the file is there. 14. In Main_7.c, add code to call the InitECap() function. There are no passed parameters or return values, so the call code is simply:
InitECap();
15. Edit Gpio.c and adjust the shared I/O pin in GPIO5 for the ECAP1 function.
16. Open and inspect the eCAP1 interrupt service routine (ECAP1_INT_ISR) in the file DefaultIsr_7.c. Notice that PwmDuty is calculated by CAP2 - CAP1 (rising to falling edge) and that PwmPeriod is calculated by CAP3 - CAP1 (rising to rising edge).
17. In ECap_7_8_9_10_12.c, set up eCAP1 to calculate PWM_duty and PWM_period. The following registers need to be modified: ECCTL2 (continuous mode, re-arm disable, and sync disable), ECCTL1 (set prescale to divide-by-1, configure capture event polarity without resetting the counter), and ECEINT (enable desired eCAP interrupt).
18. Using the PIE Interrupt Assignment Table, find the location for the eCAP1 interrupt ECAP1_INT and fill in the following information:
PIE group #:
# within group:
This information will be used in the next step.
19. Modify the end of ECap_7_8_9_10_12.c to do the following:
- Enable the ECAP1_INT interrupt in the PIE (Hint: use the PieCtrlRegs structure)
- Enable the appropriate core interrupt in the IER register
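The duty and period calculation noted in step 16 can be sketched in portable C as shown below. This is not the code from DefaultIsr_7.c: the PwmMeasurementSketch type and the pwm_from_captures() function are hypothetical, and the capture values are passed in as plain 32-bit time stamps instead of being read from the eCAP registers.

```c
#include <stdint.h>

typedef struct {
    uint32_t duty;    /* counts from rising edge to falling edge     */
    uint32_t period;  /* counts from rising edge to next rising edge */
} PwmMeasurementSketch;

/* cap1 = first rising edge, cap2 = falling edge, cap3 = next rising edge.
   Unsigned subtraction gives the right difference even if the 32-bit
   time-stamp counter wrapped between the captures. */
PwmMeasurementSketch pwm_from_captures(uint32_t cap1, uint32_t cap2, uint32_t cap3)
{
    PwmMeasurementSketch m;
    m.duty   = cap2 - cap1;
    m.period = cap3 - cap1;
    return m;
}
```

Because the counter is not reset on a capture event, the differences are what carry the timing information, not the raw capture values.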
25. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.
Questions:
- How do the captured values for PwmDuty and PwmPeriod relate to the compare register CMPA and time-base period TBPRD settings for ePWM1A?
- What is the value of PwmDuty in memory?
- What is the value of PwmPeriod in memory?
- How does it compare with the expected value?
End of Exercise
Numerical Concepts
Introduction
In this module, numerical concepts will be explored. One of the first considerations concerns multiplication: how does the user store the results of a multiplication, when the process of multiplication creates results larger than the inputs? A similar concern arises when considering accumulation, especially when long summations are performed. Next, floating-point concepts will be explored and IQmath will be described as a technique for implementing a "virtual floating-point" system to simplify the design process.
The IQmath Library is a collection of highly optimized and high-precision mathematical functions used to seamlessly port floating-point algorithms into fixed-point code. These C/C++ routines are typically used in computationally intensive real-time applications where optimal execution speed and high accuracy are needed. By using these routines a user can achieve execution speeds considerably faster than equivalent code written in standard ANSI C. In addition, by incorporating the ready-to-use high-precision functions, the IQmath library can significantly shorten DSP application development time. (The IQmath user's guide is included in the application zip file, and can be found in the /docs folder once the file is extracted and installed.)
Learning Objectives
- Integers and Fractions
- IEEE-754 Floating-Point
- IQmath
- Format Conversion of ADC Results
Module Topics
Numerical Concepts
  Module Topics
  Numbering System Basics
    Binary Numbers
    Two's Complement Numbers
    Integer Basics
    Sign Extension Mode
  Binary Multiplication
  Binary Fractions
    Representing Fractions in Binary
    Fraction Basics
    Multiplying Binary Fractions
  Fraction Coding
  Fractional vs. Integer Representation
  Floating-Point
  IQmath
    IQ Fractional Representation
    Traditional Q Math Approach
    IQmath Approach
  IQmath Library
  Converting ADC Results into IQ Format
  AC Induction Motor Example
  IQmath Summary
  Lab 8: IQmath & Floating-Point FIR Filter
Binary Numbers
The binary numbering system is the simplest numbering scheme used in computers, and is the basis for other schemes. Some details about this system are:
- It uses only two values: 1 and 0.
- Each binary digit, commonly referred to as a bit, is one place in a binary number and represents an increasing power of 2.
- The least significant bit (LSB) is to the right and has the value of 1.
- Values are represented by setting the appropriate 1's in the binary number.
- The number of bits used determines how large a number may be represented.
Examples:
0110₂ = (0 * 8) + (1 * 4) + (1 * 2) + (0 * 1) = 6₁₀
11110₂ = (1 * 16) + (1 * 8) + (1 * 4) + (1 * 2) + (0 * 1) = 30₁₀
Two's Complement Numbers
Examples:
0110₂ = (0 * -8) + (1 * 4) + (1 * 2) + (0 * 1) = 6₁₀
11110₂ = (1 * -16) + (1 * 8) + (1 * 4) + (1 * 2) + (0 * 1) = -2₁₀
The same binary values are used in these examples for two's complement as were used above for binary. Notice that the decimal value is the same when the MSB is 0, but the decimal value is quite different when the MSB is 1. Two operations are useful in working with two's complement numbers:
- The ability to obtain an additive inverse of a value
- The ability to load small numbers into larger registers (by sign extending)
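The two's complement reading of a bit pattern, and the two operations listed above, can be tried out with a short C sketch. The function names twos_value(), twos_negate(), and sign_extend_to_8() are hypothetical helpers written for this illustration, and the sketch only handles word sizes up to 16 bits (8 bits for the sign extension).

```c
#include <stdint.h>

/* Value of the low `bits` bits of `raw`, read as two's complement.
   Valid for 1 <= bits <= 16 in this sketch. */
int32_t twos_value(uint16_t raw, unsigned bits)
{
    uint32_t sign = (uint32_t)1 << (bits - 1);   /* weight of the MSB          */
    uint32_t mask = (sign << 1) - 1u;            /* keeps only `bits` bits     */
    uint32_t v    = (uint32_t)raw & mask;
    return (int32_t)(v ^ sign) - (int32_t)sign;  /* MSB counts as -2^(bits-1)  */
}

/* Additive inverse: invert every bit, then add 1 (kept to `bits` bits). */
uint16_t twos_negate(uint16_t raw, unsigned bits)
{
    uint32_t mask = (((uint32_t)1 << (bits - 1)) << 1) - 1u;
    return (uint16_t)((~(uint32_t)raw + 1u) & mask);
}

/* Sign extension to 8 bits: the MSB is copied into all higher positions.
   Valid for 1 <= bits <= 8 in this sketch. */
uint8_t sign_extend_to_8(uint16_t raw, unsigned bits)
{
    return (uint8_t)twos_value(raw, bits);
}
```

For the example values above, 0110 reads as 6 and 11110 reads as -2; sign extending them to 8 bits gives 00000110 and 11111110.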
Examples:
Original No.       0110₂ = 6₁₀            11110₂ = -2₁₀
1. Load low        0110                   11110
2. Sign Extend     00000110               11111110
                   = 4 + 2 = 6            = -128 + 64 + ... + 2 = -2
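Integer Basics
Bit weights of an n-bit integer, MSB to LSB: ±2^(n-1) ... 2^3 2^2 2^1 2^0 (the MSB weight is negative for a signed, two's complement value)
Example: 1101 = -2^3 + 2^2 + 2^0 = -3
[Figure: the value is loaded into the accumulator (ACC) and sign extended.]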
Binary Multiplication
Now that you understand two's complement numbers, consider the process of multiplying two two's complement values. As with longhand decimal multiplication, we can perform binary multiplication one place at a time, and sum the results together at the end to obtain the total product. Note: This is not the method the C28x uses in multiplying numbers; it is merely a way of observing how binary numbers work in arithmetic processes. The C28x uses 16-bit operands and a 32-bit accumulator. For the sake of clarity, consider the example below where we shall investigate the use of 4-bit values and an 8-bit accumulation:
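[Figure: 4-bit multiplication example. Inputs: 4 x -3; expected result: -12. The sum of the partial products, 11110100, is held in the accumulator; the value to be stored to data memory is left open (?).]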
In this example, consider the following:
- What are the two input values, and the expected result?
- Why are the partial products shifted left as the calculation continues?
- Why is the final partial product different than the others?
- What is the result obtained when adding the partial products?
- How shall this result be loaded into the accumulator?
- How shall we fill the remaining bit? Is this value still the expected one?
- How can the result be stored back to memory? What problems arise?
Note: With two's complement multiplication, the leading 1 in the second multiplicand is a sign bit. If the sign bit is 1, then take the two's complement of the first multiplicand to form the final partial product. Additionally, each partial product must be sign-extended for correct computation.
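The partial-product procedure in the note can be followed step by step with the sketch below, which multiplies two 4-bit two's complement values into an 8-bit accumulator. As the text says, this is not how the C28x multiplies; mul4x4_partial_products() is a hypothetical helper used only to observe the arithmetic.

```c
#include <stdint.h>

/* Multiply two 4-bit two's complement values (held in the low nibble)
   by summing sign-extended partial products into an 8-bit accumulator. */
uint8_t mul4x4_partial_products(uint8_t a_nibble, uint8_t b_nibble)
{
    /* Sign extend the first operand from 4 bits to 8 bits. */
    uint8_t a = (uint8_t)((a_nibble & 0x8u) ? (a_nibble | 0xF0u)
                                            : (a_nibble & 0x0Fu));
    uint8_t acc = 0u;
    unsigned bit;

    /* Bits 0..2 of the second operand have positive weights 1, 2, 4. */
    for (bit = 0u; bit < 3u; bit++) {
        if (b_nibble & (1u << bit)) {
            acc = (uint8_t)(acc + (uint8_t)(a << bit));
        }
    }

    /* Bit 3 is the sign bit (weight -8): add the two's complement of the
       first operand, shifted left by 3, as the final partial product. */
    if (b_nibble & 0x8u) {
        acc = (uint8_t)(acc + (uint8_t)((uint8_t)(~a + 1u) << 3));
    }
    return acc;
}
```

For 0100 x 1101 (4 x -3) the partial products are 00000100, 00010000, and 11100000, which sum to 11110100 (-12).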
Note: All of the above questions except the final one are addressed in this module. The last question may have several answers:
- Store the lower accumulator to memory. What problem is apparent using this method in this example?
- Store the upper accumulator back to memory. Wouldn't this create a loss of precision, and a problem in how to interpret the results later?
- Store both the upper and lower accumulator to memory. This solves the above problems, but creates some new ones:
  - Extra code space, memory space, and cycle time are used
  - How can the result be used as the input to a subsequent calculation? Is such a condition likely (consider any feedback system)?
From this analysis, it is clear that integers do not behave well when multiplied. Might some other type of number system behave better? Is there a number system where the results of a multiplication are bounded?
Binary Fractions
Given the problems associated with integers and multiplication, consider the possibilities of using fractional values. Fractions do not grow when multiplied; therefore, they remain representable within a given word size and solve the problem. Given the benefit of fractional multiplication, consider the issues involved with using fractions:
- How are fractions represented in two's complement?
- What issues are involved when multiplying two fractions?
Fraction Basics
Bit weights of an n-bit two's complement fraction, MSB to LSB: -2^0 . 2^-1 2^-2 2^-3 ... 2^-(n-1)
Fraction Multiplication
     0.100          1/2
  x  1.101         -3/8
  --------
  00000100
  0000000
  000100
  11100
  --------
  11110100         -3/16
Accumulator: 11110100
Data Memory: ?
As before, consider the following:
- What are the two input values and the expected result?
- As before, partial products are shifted left and the final is negative.
- How is the result (obtained when adding the partial products) read?
- How shall this result be loaded into the accumulator?
- How shall we fill the remaining bit? Is this value still the expected one?
- How can the result be stored back to memory? What problems arise?
To read the results of the fractional multiply, it is necessary to locate the binary point (the base 2 equivalent of the base 10 decimal point). Start by identifying the location of the binary point in the input values. The MSB is an integer and the next bit is 1/2, therefore, the binary point would be located between them. In our example, therefore, we would have three bits to the right of the binary point in each input value. For ease of description, we can refer to these as Q3 numbers, where Q refers to the number of places to the right of the point. When multiplying numbers, the Q values add. Thus, we would (mentally) place a binary point above the sixth LSB. We can now calculate the Q6 result more readily.
As with integers, the results are loaded low and the MSB is a sign extension of the seventh bit. If this value were loaded into the accumulator, we could store the results back to memory in a variety of ways:
- Store both low and high accumulator values back to memory. This offers maximum detail, but has the same problems as with integer multiply.
- Store only the high (or low) accumulator back to memory. This creates a potential for a memory littered with varying Q-types.
- Store the upper accumulator shifted to the left by 1. This would store values back to memory in the same Q format as the input values, and with equal precision to the inputs.
How shall the left shift be performed? Here are three methods:
- Explicit shift (C or assembly code)
- Shift on store (assembly code)
- Use Product Mode shifter (assembly code)
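The third storage option, with the shift done explicitly in C, can be sketched for 16-bit Q15 operands as follows. The function name q15_mpy_sketch() is hypothetical. The sketch assumes the compiler implements right shift of a negative value as an arithmetic shift, and it does not handle the single overflowing input pair, -1 x -1.

```c
#include <stdint.h>

/* Q15 x Q15 multiply that returns a Q15 result.
   The 32-bit product is Q30 and carries one redundant sign bit, so the
   upper 16 bits shifted left by 1 hold the Q15 answer. Shifting the
   whole product right by 15 selects exactly the same bits. */
int16_t q15_mpy_sketch(int16_t a, int16_t b)
{
    int32_t acc = (int32_t)a * (int32_t)b;   /* Q30 product in the "accumulator" */
    return (int16_t)(acc >> 15);             /* back to Q15                      */
}
```

With 1/2 (16384) and -3/8 (-12288) as inputs, the result is -6144, which is -3/16 in Q15, so the stored value has the same Q format as the inputs.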
Fraction Coding
Although COFF tools recognize values in integer, hex, binary, and other forms, they understand only integer, or non-fractional values. To use fractions within the C28x, it is necessary to describe them as though they were integers. This turns out to be a very simple trick. Consider the following number lines:
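[Figure: number lines comparing the fraction range with the integer range; the scale factor between them is 32768 (2^15).]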
void main(void)
{
    int16 coef = 32768*707/1000;    // 0.707 in Q15
    int16 x, y;

    y = (int16)(((int32)coef * (int32)x) >> 15);
}
By multiplying a fraction by 32K (32768), a normalized fraction is created, which can be passed through the COFF tools as an integer. Once in the C28x, the normalized fraction looks and behaves exactly as a fraction. Thus, when using fractional constants in a C28x program, the coder first multiplies the fraction by 32768, and uses the resulting integer (rounded to the nearest whole value) to represent the fraction. The following is a simple, but effective method for getting fractions past the assembler:
1. Express the fraction as a decimal number (drop the decimal point).
2. Multiply by 32768.
3. Divide by the proper multiple of 10 to restore the decimal position.
Examples:
To represent 0.62:      32768 x 62 / 100
To represent 0.1405:    32768 x 1405 / 10000
This method produces a valid number accurate to 16 bits. You will not need to do the math yourself, and changing values in your code becomes rather simple.
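The three-step method above can be wrapped in a macro so that the compiler does the arithmetic. Q15_FRAC is a hypothetical helper name, not part of any TI header. Note that integer division truncates, so the coded value can be one count below the nearest whole value.

```c
#include <stdint.h>

/* Hypothetical helper: Q15 code for the fraction num/den, with |num/den| < 1.
   The long constant 32768L keeps the intermediate product in 32 bits. */
#define Q15_FRAC(num, den) ((int16_t)((32768L * (num)) / (den)))

static const int16_t coef_0_62   = Q15_FRAC(62, 100);      /* 0.62   */
static const int16_t coef_0_1405 = Q15_FRAC(1405, 10000);  /* 0.1405 */
```

Changing a coefficient then only requires editing the decimal digits in the source.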
Fractional vs. Integer Representation
- Integers grow when you multiply them
- Fractions have limited range
- Fractions can still grow when you add them
- Scaling an application is time consuming
Floating-Point
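IEEE-754 single precision format (32 bits):
s eeeeeeee fffffffffffffffffffffff
1-bit sign, 8-bit exponent, 23-bit mantissa (fraction)
Case 1: if e = 255 and f ≠ 0, then v = NaN
Case 2: if e = 255 and f = 0, then v = (-1)^s * infinity
Case 3: if 0 < e < 255, then v = (-1)^s * 2^(e-127) * (1.f)
Case 4: if e = 0 and f ≠ 0, then v = (-1)^s * 2^(-126) * (0.f)
Case 5: if e = 0 and f = 0, then v = (-1)^s * 0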
Normalized values: Case 3
Advantage: Exponent gives large dynamic range
Disadvantage: Precision of a number depends on its exponent
Non-uniform distribution:
- Precision greatest near zero
- Less precision the further you get from zero
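The sign, exponent, and fraction fields can be inspected on any machine whose float type is the 32-bit IEEE-754 format. The sketch below makes that assumption; FloatFieldsSketch, float_split(), and float_rebuild() are hypothetical names, and only the finite cases (exponent below 255) are rebuilt.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned s;   /* sign, 1 bit             */
    unsigned e;   /* biased exponent, 8 bits */
    uint32_t f;   /* fraction, 23 bits       */
} FloatFieldsSketch;

/* Split a float into its three fields (assumes a 32-bit IEEE-754 float). */
FloatFieldsSketch float_split(float v)
{
    uint32_t bits;
    FloatFieldsSketch p;
    memcpy(&bits, &v, sizeof bits);           /* reinterpret the bit pattern */
    p.s = (unsigned)(bits >> 31);
    p.e = (unsigned)((bits >> 23) & 0xFFu);
    p.f = bits & 0x7FFFFFu;
    return p;
}

/* Rebuild the value for the finite cases only (e < 255). */
double float_rebuild(FloatFieldsSketch p)
{
    double mant = (double)p.f / 8388608.0;            /* 0.f, since 2^23 = 8388608 */
    int    exp2 = (p.e == 0u) ? -126 : (int)p.e - 127;
    double v    = (p.e == 0u) ? mant : 1.0 + mant;    /* 0.f or 1.f */

    while (exp2 > 0) { v *= 2.0; exp2--; }
    while (exp2 < 0) { v /= 2.0; exp2++; }
    return p.s ? -v : v;
}
```

Because the step between neighboring values is set by the exponent, the same 23 fraction bits resolve much finer detail near zero than they do for large magnitudes.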
Using Floating-Point
1) Should be using a C28x device with hardware floating-point support!
2) Add the floating-point RTS library(s) to the CCS project:
   rts2800_fpu32.lib: standard RTS lib (required); comes with compiler

ASM: I16TOF32
C: (float)
[32-bit float layout, bits 31 to 0: s eeeeeeee fffffffffffffffffffffff]

#define AdcFsVoltage ((float)3.0)    // ADC full scale voltage
float Result;                        // ADC result

void main(void)
{
    // Convert unsigned 16-bit result to 32-bit float. Gives value of 0 to 4095.
    // Scale result by 1/4096. Gives value of 0 to ~1.
    // Scale result by AdcFsVoltage. Gives value of 0 to ~3.0.
    Result = (AdcFsVoltage/4096.0)*(float)AdcMirror.ADCRESULT0;
}
Disadvantages
- Somewhat higher device cost
- May offer insufficient precision for some calculations due to the 23-bit mantissa and the influence of the exponent
What if you don't have the luxury of using a floating-point C28x device?
IQmath
Implementing complex digital control algorithms on a Digital Signal Processor (DSP), or any other DSP-capable processor, typically runs into the following issues:
- Algorithms are typically developed using floating-point math
- Floating-point devices are more expensive than fixed-point devices
- Converting floating-point algorithms to a fixed-point device is very time consuming
- Conversion process is one way and therefore backward simulation is not always possible
The design may initially start with a simulation (e.g. MATLAB) of a control algorithm, which typically would be written in floating-point math (C or C++). This algorithm can be easily ported to a floating-point device; however, for cost reasons, a 16-bit or 32-bit fixed-point device would most likely be used in many target systems. The effort and skill involved in converting a floating-point algorithm to function using a 16-bit or 32-bit fixed-point device is quite significant. A great deal of time (many days or weeks) would be needed for reformatting, scaling and coding the problem. Additionally, the final implementation typically has little resemblance to the original algorithm. Debugging is not an easy task and the code is not easy to maintain or document.
IQ Fractional Representation
A new approach to fixed-point algorithm development, termed "IQmath", can greatly simplify the design development task. This approach can also be termed "virtual floating-point" since it looks like floating-point, but it is implemented using fixed-point techniques.
[Figure: 32-bit IQ word, bits 31 to 0: S IIIIIIII fffffffffffffffffffffff (sign bit, integer bits, fraction bits), forming a 32-bit mantissa.]
The IQmath approach enables the seamless portability of code between fixed and floating-point devices. This approach is applicable to many problems that do not require a large dynamic range, such as motor or digital control applications.
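Both floating-point and IQ formats have 2^32 possible values on the number line. It's how each distributes these values that differs.
Traditional Q Math Approach
[Figure: y = mx + b in Q24. M and X (I8Q24 each) are multiplied into an I16Q48 product. B is shifted << 24 and sign extended so that it aligns with the Q48 product, and is added to it. The sum is then shifted >> 24 to return to Q24 for Y.]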
The traditional approach to performing math operations, using fixed-point numerical techniques can be demonstrated using a simple linear equation example. The floating-point code for a linear equation would be:
float Y, M, X, B;
Y = M * X + B;
For the fixed-point implementation, assume all data is 32 bits, and that the "Q" value, or location of the binary point, is set to 24 fractional bits (Q24). The numerical range and resolution for a 32-bit Q24 number is as follows:
Q value    Min Value                       Max Value                             Resolution
Q24        -2^(31-24) = -128.000 000 00    2^(31-24) - 2^-24 = 127.999 999 94    2^-24 = 0.000 000 06
Compared to the floating-point representation, the fixed-point code looks quite cumbersome and has little resemblance to the floating-point equation. It is obvious why programmers prefer using floating-point math.
The slide shows the implementation of the equation on a processor containing hardware that can perform a 32x32 bit multiplication, 64-bit addition and 64-bit shifts (logical and arithmetic) efficiently. The basic approach in traditional fixed-point "Q" math is to align the binary point of the operands that get added to or subtracted from the multiplication result. As shown in the slide, the multiplication of M and X (two Q24 numbers) results in a Q48 value that is stored in a 64-bit register. The value B (Q24) needs to be scaled to a Q48 number before addition to the M*X value (low order bits zero filled, high order bits sign extended). The final result is then scaled back to a Q24 number (arithmetic shift right) before storing into Y (Q24).
Many programmers may be familiar with 16-bit fixed-point "Q" math that is in common use. The same example using 16-bit numbers with 15 fractional bits (Q15) would be coded as follows:
int16 Y, M, X, B;    // numbers are all Q15
Y = ((int32) M * (int32) X + ((int32) B << 15)) >> 15;
In both cases, the principal methodology is the same. The binary point of the operands that get added to or subtracted from the multiplication result must be aligned.
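The binary point alignment used in both cases can be sketched in portable C as shown below. The function names are hypothetical, the standard int16_t, int32_t, and int64_t types stand in for the int16, int32, and int64 types of the text, and the scaling of B is written as a multiplication by 2^Q because left-shifting a negative value is undefined in standard C. The final right shift assumes an arithmetic shift for negative values.

```c
#include <stdint.h>

/* Traditional Q24: Y = M*X + B with all values in Q24. */
int32_t linear_q24_traditional(int32_t m, int32_t x, int32_t b)
{
    int64_t product = (int64_t)m * (int64_t)x;          /* Q24 * Q24 = Q48  */
    int64_t b_q48   = (int64_t)b * ((int64_t)1 << 24);  /* align B to Q48   */
    return (int32_t)((product + b_q48) >> 24);          /* scale back to Q24 */
}

/* Traditional Q15: the same steps with 16-bit data and a 32-bit sum. */
int16_t linear_q15_traditional(int16_t m, int16_t x, int16_t b)
{
    int32_t product = (int32_t)m * (int32_t)x;          /* Q15 * Q15 = Q30  */
    int32_t b_q30   = (int32_t)b * 32768;               /* align B to Q30   */
    return (int16_t)((product + b_q30) >> 15);          /* scale back to Q15 */
}
```

For example, with M = 1.5, X = 2.0, and B = -0.25 in Q24, the result is 2.75 in Q24 (46137344).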
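IQmath Approach
[Figure: y = mx + b in Q24. M and X (I8Q24 each) are multiplied, and the product is shifted >> 24 to return it to I8Q24. B (I8Q24) is then added directly, giving Y in I8Q24.]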
In the "IQmath" approach, rather than scaling the operands that get added to or subtracted from the multiplication result, we do the reverse. The binary point of the multiplication result is scaled back so that it aligns with the operands that are added to or subtracted from it. The C code implementation of this is given by the linear equation below:
int32 Y, M, X, B;
Y = (((int64) M * (int64) X) >> 24) + B;
The slide shows the implementation of the equation on a processor containing hardware that can perform a 32x32 bit multiply, 32-bit addition/subtraction and 64-bit logical and arithmetic shifts efficiently. The key advantage of this approach is shown by what can then be done with the C and C++ compiler to simplify the coding of the linear equation example. Let's take an additional step and create a multiply function in C that performs the following operation:
int32 _IQ24mpy(int32 M, int32 X)
{
    return ((int64) M * (int64) X) >> 24;
}
The linear equation can then be coded as:
Y = _IQ24mpy(M, X) + B;
Already we can see a marked improvement in the readability of the linear equation.
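A portable version of this idea can be tried on a PC. The names below (iq24_mpy_sketch, IQ24_SKETCH, linear_iq24_sketch) are hypothetical and are not the IQmath library's own functions; the sketch assumes an arithmetic right shift for negative 64-bit values.

```c
#include <stdint.h>

/* Hypothetical constant helper: convert a real number to Q24. */
#define IQ24_SKETCH(v) ((int32_t)((v) * 16777216.0))   /* 2^24 = 16777216 */

/* Scaled multiply: the Q48 product is shifted back to Q24 immediately,
   so the result can be added to other Q24 values without further scaling. */
int32_t iq24_mpy_sketch(int32_t m, int32_t x)
{
    return (int32_t)(((int64_t)m * (int64_t)x) >> 24);
}

/* The linear equation Y = M*X + B, all values in Q24. */
int32_t linear_iq24_sketch(int32_t m, int32_t x, int32_t b)
{
    return iq24_mpy_sketch(m, x) + b;
}
```

Only the multiply needs special handling; addition and subtraction work directly on the Q24 integers.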
Using the operator overloading features of C++, we can overload the multiplication operator "*" such that when a particular data type is encountered, it will automatically implement the scaled multiply operation. Let's define a data type called "iq" and assign the linear variables to this data type:
iq Y, M, X, B;    // numbers are all Q24
The linear equation can then be coded as:
Y = M * X + B;
This final equation looks identical to the floating-point representation. It looks "natural". The four approaches are summarized in the table below:
Math Implementations                  Linear Equation Code
32-bit floating-point math in C       Y = M * X + B;
32-bit fixed-point "Q" math in C      Y = ((int64) M * (int64) X + ((int64) B << 24)) >> 24;
32-bit IQmath in C                    Y = _IQ24mpy(M, X) + B;
32-bit IQmath in C++                  Y = M * X + B;
Essentially, the mathematical approach of scaling the multiplication result enables a cleaner and more "natural" approach to coding fixed-point problems. For want of a better term, we call this approach "IQmath"; it can also be described as "virtual floating-point".
IQmath Approach
Multiply Operation
Y = (((i64) M * (i64) X) >> Q) + B;
IQmath Approach
It looks like floating-point!
Floating-Point:    float Y, M, X, B;    Y = M * X + B;
IQmath in C:       _iq Y, M, X, B;      Y = _IQmpy(M, X) + B;
IQmath in C++:     iq Y, M, X, B;       Y = M * X + B;
IQmath Approach
GLOBAL_Q simplification
User selects the "Global Q" value (GLOBAL_Q) for the whole application, based on the required dynamic range or resolution, for example:
GLOBAL_Q    Max Val            Min Val            Resolution
28          7.999 999 996      -8.000 000 000     0.000 000 004
24          127.999 999 94     -128.000 000 00    0.000 000 06
20          2047.999 999       -2048.000 000      0.000 001
#define GLOBAL_Q 18    // set in IQmathLib.h file
_iq Y, M, X, B;
Y = _IQmpy(M,X) + B;
The basic "IQmath" approach was adopted in the creation of a standard math library for the Texas Instruments TMS320C28x DSP fixed-point processor. This processor contains efficient hardware for performing 32x32 bit multiply, 64-bit shifts (logical and arithmetic) and 32-bit add/subtract operations, which are ideally suited for 32 bit "IQmath". Some enhancements were made to the basic "IQmath" approach to improve flexibility. They are: Setting of GLOBAL_Q Parameter Value: Depending on the application, the amount of numerical resolution or dynamic range required may vary. In the linear equation example, we used a Q value of 24 (Q24). There is no reason why any value of Q can't be used. In the "IQmath" library, the user can set a GLOBAL_Q parameter, with a range of 1 to 30 (Q1 to Q30). All functions used in the program will use this GLOBAL_Q value. For example:
#define GLOBAL_Q 18
Y = _IQmpy(M, X) + B;    // all values use GLOBAL_Q = 18
If, for some reason, a particular function or equation requires a different resolution, then the user has the option to explicitly specify the Q value for the operation. For example:
Y = _IQ23mpy(M,X) + B; // all values use Q23, including B and Y
The Q value must be consistent for all expressions in the same line of code.
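The trade-off between range and resolution for a given Q value follows directly from the 32-bit word size, and can be computed with the sketch below. The function names are hypothetical; Q is assumed to be in the library's range of 1 to 30.

```c
#include <stdint.h>

/* Smallest step of a 32-bit IQ number with q fractional bits: 2^-q. */
double iq_resolution_sketch(unsigned q)
{
    return 1.0 / (double)((uint32_t)1 << q);
}

/* Most negative value: -2^(31-q). */
double iq_min_sketch(unsigned q)
{
    return -(double)((uint32_t)1 << (31u - q));
}

/* Most positive value: 2^(31-q) minus one resolution step. */
double iq_max_sketch(unsigned q)
{
    return (double)((uint32_t)1 << (31u - q)) - iq_resolution_sketch(q);
}
```

For Q24 this gives a range of -128 to just under 128 with a resolution of about 0.000 000 06; each additional fractional bit halves both the range and the step size.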
[Slide: one source line compiled for either math type]
Y = _IQmpy(M, X) + B;
2) Select math type in IQmathLib.h:
   #if MATH_TYPE == IQ_MATH        (fixed-point math code)
   #if MATH_TYPE == FLOAT_MATH     (floating-point math code)
3) For FLOAT_MATH, the compiler automatically converts to:
   Y = (float)M * (float)X + (float)B;
Selecting FLOAT_MATH or IQ_MATH Mode: As was highlighted in the introduction, we would ideally like to be able to have a single source code that can execute on a floating-point or fixed-point target device simply by recompiling the code. The "IQmath" library supports this by setting a mode, which selects either IQ_MATH or FLOAT_MATH. This operation is performed by simply redefining the function in a header file. For example:
#if MATH_TYPE == IQ_MATH
#define _IQmpy(M , X) _IQmpy(M , X)
#elif MATH_TYPE == FLOAT_MATH
#define _IQmpy(M , X) ((float) (M) * (float) (X))
#endif
Essentially, the programmer writes the code using the "IQmath" library functions and the code can be compiled for floating-point or "IQmath" operations.
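The mode-selection mechanism can be imitated with ordinary preprocessor conditionals. Everything in the sketch below is hypothetical (the _SKETCH names are not the library's identifiers, and Q24 is an arbitrary choice); it only illustrates how one source line can compile to either fixed-point or floating-point arithmetic.

```c
#include <stdint.h>

#define FLOAT_MATH_SKETCH 0
#define IQ_MATH_SKETCH    1

#ifndef MATH_TYPE_SKETCH
#define MATH_TYPE_SKETCH IQ_MATH_SKETCH     /* change to FLOAT_MATH_SKETCH to switch */
#endif

#if MATH_TYPE_SKETCH == IQ_MATH_SKETCH
typedef int32_t iq_sketch_t;                /* Q24 fixed-point */
#define IQ_SKETCH(v)       ((iq_sketch_t)((v) * 16777216.0))
#define IQMPY_SKETCH(a, b) ((iq_sketch_t)(((int64_t)(a) * (int64_t)(b)) >> 24))
#else
typedef float iq_sketch_t;                  /* native floating-point */
#define IQ_SKETCH(v)       ((iq_sketch_t)(v))
#define IQMPY_SKETCH(a, b) ((a) * (b))
#endif

/* The application code is written once and is identical in both modes. */
iq_sketch_t linear_sketch(iq_sketch_t m, iq_sketch_t x, iq_sketch_t b)
{
    return IQMPY_SKETCH(m, x) + b;
}
```

A call such as linear_sketch(IQ_SKETCH(1.5), IQ_SKETCH(2.0), IQ_SKETCH(-0.25)) yields the representation of 2.75 in whichever mode is selected.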
IQmath Library
Floating-Point                   IQmath in C                      IQmath in C++
float A, B;                      _iq A, B;                        iq A, B;
A = 1.2345                       A = _IQ(1.2345)                  A = IQ(1.2345)
A * B                            _IQmpy(A , B)                    A * B
A / B                            _IQdiv(A , B)                    A / B
A + B                            A + B                            A + B
A - B                            A - B                            A - B
>, >=, <, <=, ==, !=, &&, ||     >, >=, <, <=, ==, !=, &&, ||     >, >=, <, <=, ==, !=, &&, ||
sin(A),cos(A)                    _IQsin(A), _IQcos(A)             IQsin(A),IQcos(A)
sin(A*2pi),cos(A*2pi)            _IQsinPU(A), _IQcosPU(A)         IQsinPU(A),IQcosPU(A)
asin(A),acos(A)                  _IQasin(A),_IQacos(A)            IQasin(A),IQacos(A)
atan(A),atan2(A,B)               _IQatan(A), _IQatan2(A,B)        IQatan(A),IQatan2(A,B)
atan2(A,B)/2pi                   _IQatan2PU(A,B)                  IQatan2PU(A,B)
sqrt(A),1/sqrt(A)                _IQsqrt(A), _IQisqrt(A)          IQsqrt(A),IQisqrt(A)
sqrt(A*A + B*B)                  _IQmag(A,B)                      IQmag(A,B)
exp(A)                           _IQexp(A)                        IQexp(A)
Saturation:
if(A > Pos) A = Pos              _IQsat(A,Pos,Neg)                IQsat(A,Pos,Neg)
if(A < Neg) A = Neg
Additionally, the "IQmath" library contains DSP library modules for filters (FIR & IIR) and Fast Fourier Transforms (FFT & IFFT).
Operation            Floating-Point                 IQmath in C            IQmath in C++
                     A                              _IQtoIQN(A)            IQtoIQN(A)
                     A                              _IQNtoIQ(A)            IQNtoIQ(A)
                     (long) A                       _IQint(A)              IQint(A)
                     A - (long) A                   _IQfrac(A)             IQfrac(A)
                     A * (float) B                  _IQmpyI32(A,B)         IQmpyI32(A,B)
                     (long) (A * (float) B)         _IQmpyI32int(A,B)      IQmpyI32int(A,B)
fraction(iq*long)    A - (long) (A * (float) B)     _IQmpyI32frac(A,B)     IQmpyI32frac(A,B)
qN to iq             A                              _QNtoIQ(A)             QNtoIQ(A)
iq to qN             A                              _IQtoQN(A)             IQtoQN(A)
string to iq         atof(char)                     _atoIQ(char)           atoIQ(char)
IQ to float          A                              _IQtoF(A)              IQtoF(A)
IQ to ASCII          sprintf(A,B,C)                 _IQtoA(A,B,C)          IQtoA(A,B,C)
The IQmath package consists of a library of math functions, a C header file, and a C++ header file.
16 vs. 32 Bits
The "IQmath" approach could also be used on 16-bit numbers and for many problems, this is sufficient resolution. However, in many control cases, the user needs to use many different "Q" values to accommodate the limited resolution of a 16-bit number. With DSP devices like the TMS320C28x processor, which can perform 16-bit and 32-bit math with equal efficiency, the choice becomes more one of productivity (time to market). Why bother spending a whole lot of time trying to code using 16-bit numbers when you can simply use 32-bit numbers, pick one value of "Q" that will accommodate all cases and not worry about spending too much time optimizing?
Of course there is a concern about data RAM usage if numbers that could be represented in 16 bits all use 32 bits. This is becoming less of an issue in today's processors because of the finer technology used and the amount of RAM that can be cheaply integrated. In many cases, this problem can also be mitigated by performing intermediate calculations using 32-bit numbers: the input is converted from 16 to 32 bits, and the output is converted back to 16 bits before the final results are stored. In many problems, it is the intermediate calculations that require additional accuracy to avoid quantization problems.
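Keeping stored data in 16 bits while computing in 32 bits amounts to a format conversion at the input and output of the calculation. The sketch below assumes Q15 storage and Q24 intermediate values; both choices and the function names are hypothetical. The output conversion saturates because Q24 can hold values outside the Q15 range, and the right shift assumes an arithmetic shift for negative values.

```c
#include <stdint.h>

/* Q15 (16-bit) to Q24 (32-bit): 9 more fractional bits, so multiply by 2^9. */
int32_t q15_to_iq24_sketch(int16_t x)
{
    return (int32_t)x * 512;
}

/* Q24 back to Q15 with saturation to the range -1 to just under +1. */
int16_t iq24_to_q15_sat_sketch(int32_t x)
{
    if (x >= ((int32_t)1 << 24)) {
        return 32767;              /* +1.0 or more clips to the largest Q15 value */
    }
    if (x < -((int32_t)1 << 24)) {
        return -32768;             /* below -1.0 clips to -1.0 */
    }
    return (int16_t)(x >> 9);      /* drop the 9 extra fractional bits */
}
```

Only the buffers hold 16-bit data; every intermediate product and sum keeps the full 32-bit resolution.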
As you may recall, the converted values of the ADC can be placed in the upper 12 bit of the RESULT0 register (when not using AdcMirror register). Before these values are filtered using the IQmath library, they need to to be put into the IQ format as a 32-bit long. For uni-polar ADC inputs (i.e., 0 to 3 V inputs), a conversion to global IQ format can be achieved with:
IQresult_unipolar = _IQmpy(_IQ(3.0),_IQ12toIQ((_iq) AdcRegs.ADCRESULT0));
How can we modify the above to recover bi-polar inputs, for example +-1.5 volts? One could do the following to offset the +1.5V analog biasing applied to the ADC input:
IQresult_bipolar = _IQmpy(_IQ(3.0),_IQ12toIQ((_iq) AdcRegs.ADCRESULT0)) - _IQ(1.5);
However, one can see that the largest intermediate value the equation above could reach is 3.0. This means that it cannot be used with an IQ data type of IQ30 (IQ30 range is -2 < x < ~2). Since the IQmath library supports IQ types from IQ1 to IQ30, this could be an issue in some applications. The following clever approach supports IQ types from IQ1 to IQ30:
IQresult_bipolar = _IQmpy(_IQ(1.5),_IQ15toIQ((_iq) ((int16) (AdcRegs.ADCRESULT0 ^ 0x8000))));
The largest intermediate value that this equation could reach is 1.5. Therefore, IQ30 is easily supported.
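The XOR approach can be checked on a host PC with a small emulation in standard C. This is only a sketch: it assumes a global Q of 24, a left-justified 12-bit result in a 16-bit word, and stand-in helpers (iq_from_q15, iq_mpy) that imitate, but are not, the IQmath library functions.

```c
#include <stdint.h>

typedef int32_t iq_t;                       /* Q24 value (assumed GLOBAL_Q) */
#define IQ_ONE_POINT_FIVE ((iq_t)25165824)  /* 1.5 * 2^24                   */

/* Stand-in for _IQ15toIQ: Q15 to Q24 is a scale by 2^(24-15). */
static iq_t iq_from_q15(int32_t q15)
{
    return (iq_t)(q15 * 512);
}

/* Stand-in for _IQmpy: 64-bit product scaled back to Q24. */
static iq_t iq_mpy(iq_t a, iq_t b)
{
    return (iq_t)(((int64_t)a * (int64_t)b) / 16777216);
}

/* Flipping the MSB turns the offset-binary ADC code into a two's
 * complement Q15 number in the range -1.0 to just under +1.0. */
iq_t adc_bipolar_iq(uint16_t raw)
{
    uint16_t flipped = (uint16_t)(raw ^ 0x8000u);
    int32_t q15 = (flipped >= 0x8000u) ? (int32_t)flipped - 65536
                                       : (int32_t)flipped;
    return iq_mpy(IQ_ONE_POINT_FIVE, iq_from_q15(q15));
}
```

A mid-scale code (0x8000) maps to 0 V and a code of 0x0000 maps to -1.5 V. No intermediate value exceeds a magnitude of 1.5.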
Can a Single ADC Interface Code Line be Written for IQmath and Floating-Point?
#if MATH_TYPE == IQ_MATH
#define AdcFsVoltage _IQ(3.0)           // ADC full scale voltage
#else                                   // MATH_TYPE is FLOAT_MATH
#define AdcFsVoltage _IQ(3.0/4096.0)    // ADC full scale voltage
#endif

_iq Result;                             // ADC result

void main(void)
{
    Result = _IQmpy(AdcFsVoltage, _IQ12toIQ((_iq)AdcMirror.ADCRESULT0));
}

FLOAT_MATH behavior: _iq is a float, and _IQ12toIQ() does nothing.
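The idea of one code line serving both math types can be tried on a host PC with stand-in macros. The sketch below is an assumption-laden illustration, not the contents of IQmathLib.h: every SK_ name is hypothetical, the fixed-point branch assumes Q24, and the ADC value is taken as a right-justified 12-bit number.

```c
#include <stdint.h>

#define SK_FLOAT_MATH 0
#define SK_IQ_MATH    1
#ifndef SK_MATH_TYPE
#define SK_MATH_TYPE SK_IQ_MATH          /* default for this sketch */
#endif

#if SK_MATH_TYPE == SK_IQ_MATH
typedef int32_t sk_iq;                                  /* Q24 fixed point */
#define SK_IQ(x)        ((sk_iq)((x) * 16777216.0))     /* constant to Q24 */
#define SK_IQ12toIQ(x)  ((sk_iq)(x) * 4096)             /* Q12 to Q24      */
#define SK_IQmpy(a, b)  ((sk_iq)(((int64_t)(a) * (int64_t)(b)) / 16777216))
#define SK_ADC_FS       SK_IQ(3.0)
#else
typedef float sk_iq;
#define SK_IQ(x)        ((sk_iq)(x))
#define SK_IQ12toIQ(x)  ((sk_iq)(x))                    /* does nothing    */
#define SK_IQmpy(a, b)  ((a) * (b))
#define SK_ADC_FS       SK_IQ(3.0 / 4096.0)             /* scaling moved here */
#endif

/* The same source line is used for both math types. */
sk_iq adc_to_volts(uint16_t adc12)
{
    return SK_IQmpy(SK_ADC_FS, SK_IQ12toIQ((sk_iq)adc12));
}
```

Because the conversion macro does nothing in the floating-point build, the 1/4096 scaling has to be folded into the full-scale constant, which is why the two #define lines differ.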
Sensorless, ACI induction machine direct rotor flux control Goal: motor speed estimation & alpha-axis stator current estimation
The "IQmath" approach is ideally suited for applications where a large numerical dynamic range is not required. Motor control is an example of such an application (audio and communication algorithms are others). As an example, the IQmath approach has been applied to the sensorless direct field control of an AC induction motor. This is probably one of the most challenging motor control problems and, as will be shown later, requires numerical accuracy greater than 16 bits in the control calculations. The above slide is a block diagram representation of the key control blocks and their interconnections. Essentially this system implements a "Forward Control" block for controlling the d-q axis motor current using PID controllers, and a "Feedback Control" block that estimates rotor flux from current and voltage measurements by integrating the back EMF with a compensated voltage from the current model. The motor speed is simply estimated from rotor flux differentiation and open-loop slip computation. The system was initially implemented on a "Simulator Test Bench" which uses a simulation of an "AC Induction Motor Model" in place of a real motor. Once working, the system was then tested using a real motor on an appropriate hardware platform. Each individual block shown in the slide exists as a stand-alone C/C++ module, which can be interconnected to form the complete control system. This modular approach allows reusability and portability of the code. The next few slides show the coding of one particular block, the PARK Transform, using floating-point and "IQmath" approaches in C:
Floating-point version:

#include <math.h>

#define TWO_PI 6.28318530717959

void park_calc(PARK *v)
{
    float cos_ang, sin_ang;

    sin_ang = sin(TWO_PI * v->ang);
    cos_ang = cos(TWO_PI * v->ang);

    v->de = (v->ds * cos_ang) + (v->qs * sin_ang);
    v->qe = (v->qs * cos_ang) - (v->ds * sin_ang);
}

IQmath version (each floating-point type, function call and multiply is replaced by its IQmath equivalent):

#include "IQmathLib.h"

#define TWO_PI _IQ(6.28318530717959)    // constant must be in IQ format for _IQmpy()

void park_calc(PARK *v)
{
    _iq cos_ang, sin_ang;

    sin_ang = _IQsin(_IQmpy(TWO_PI, v->ang));
    cos_ang = _IQcos(_IQmpy(TWO_PI, v->ang));

    v->de = _IQmpy(v->ds, cos_ang) + _IQmpy(v->qs, sin_ang);
    v->qe = _IQmpy(v->qs, cos_ang) - _IQmpy(v->ds, sin_ang);
}
The complete system was coded using "IQmath". Based on analysis of coefficients in the system, the largest coefficient had a value of 33.3333. This indicated that a minimum dynamic range of 7 bits (+/-64 range) was required, which translates to a GLOBAL_Q value of 32-7 = 25 (Q25). Just to be safe, the initial simulation runs were conducted with a GLOBAL_Q value of 24 (Q24). The plots start from a step change in reference speed from 0.0 to 0.5 and 1024 samples are taken.
[Figure: simulation plots of speed and current for the IQmath and floating-point implementations]
The speed eventually settles to the desired reference value and the stator current exhibits a clean and stable oscillation. The block diagram slide shows at which points in the control system the plots are taken from.
[Figure: I8Q24 fractions compared with single-precision floating-point, which has the same precision as I8Q24 in this region]
In the region where these particular computations occur, the precision of single-precision floating-point just happens to equal the precision of the I8Q24 format. So, both produce similar results!
[Figure: two sets of IQmath speed and current plots]
With the ability to select the GLOBAL_Q value for all calculations in the "IQmath", an experiment was conducted to see what maximum and minimum Q value the system could tolerate before it became unstable. The results are tabulated in the slide below:
    Q range        Stability Range
    Q31 to Q27     Unstable (not enough dynamic range)
    Q26 to Q19     Stable
    Q18 to Q0      Unstable (not enough resolution, quantization problems)
The above indicates that the AC induction motor system that we simulated requires a minimum of 7 bits of dynamic range (+/-64) and a minimum of 19 bits of numerical resolution (+/-0.000002). This confirms our initial analysis that the largest coefficient value of 33.33333 requires a minimum dynamic range of 7 bits. As a general guideline, users of IQmath should examine the largest coefficient used in the equations; this is a good starting point for setting the initial GLOBAL_Q value. Then, through simulation or experimentation, the user can reduce the GLOBAL_Q until the system resolution starts to cause instability or performance degradation. The user then has a maximum and minimum limit, and a safe approach is to pick a midpoint.

What the above analysis also confirms is that this particular problem does require some calculations to be performed using greater than 16-bit precision. The above example requires a minimum of 7 + 19 = 26 bits of numerical accuracy for some parts of the calculations. Hence, if one were implementing the AC induction motor control algorithm using a 16-bit fixed-point DSP, it would require the implementation of higher precision math for certain portions. This would take more cycles and programming effort.

The great benefit of using GLOBAL_Q is that the user does not necessarily need to go into details to assign an individual Q for each variable in a whole system, as is typically done in conventional fixed-point programming. This is time-consuming work. By using 32-bit resolution and the "IQmath" approach, the user can easily evaluate the overall resolution and quickly implement a typical digital motor control application without quantization problems.
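The "largest coefficient" guideline can be written down as a small helper. The sketch below is one way to do that arithmetic on a host PC. The function names are hypothetical, and it only provides the starting value of GLOBAL_Q; the lower limit still has to be found by simulation or experimentation as described above.

```c
/* Number of integer bits, including the sign bit, needed so that a
 * value of magnitude max_abs lies inside the signed range. */
int iq_integer_bits(double max_abs)
{
    int bits = 1;            /* sign bit alone covers -1 <= x < 1 */
    double range = 1.0;

    while (range <= max_abs && bits < 32) {
        range *= 2.0;        /* each extra integer bit doubles the range */
        bits++;
    }
    return bits;
}

/* Starting GLOBAL_Q for a 32-bit IQ number: whatever is left over
 * after the integer bits is available as fraction bits. */
int iq_initial_global_q(double max_abs)
{
    return 32 - iq_integer_bits(max_abs);
}
```

For the largest coefficient of 33.3333 this gives 7 integer bits and a starting value of Q25, matching the analysis above. A largest value of 1.5 gives Q30.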
[Table: benchmark results, with the following rows]
- B1: ACI module cycles
- B2: Feedforward control cycles
- B3: Feedback control cycles
- Total control cycles (B2+B3)
- % of available MHz used (20 kHz control loop)
Notes: C28x compiled on codegen tools v5.0.0, -g (debug enabled), -o3 (max. optimization) fast RTS lib v1.0beta1 IQmath lib v1.4d
Using the profiling capabilities of the respective DSP tools, the table above summarizes the number of cycles and code size of the forward and feedback control blocks. The MIPS used is based on a system sampling frequency of 20 kHz, which is typical of such systems.
IQmath Summary
- One source code set for simulation vs. target device
- Numerical resolution adjustability based on application requirement (set in IQmathLib.h file: #define GLOBAL_Q 18)
- Numerical accuracy without sacrificing time and cycles
- Rapid conversion/porting and implementation of algorithms

IQmath library is freeware - available from TI DSP website http://www.ti.com/c2000
The IQmath approach, matched to a fixed-point processor with 32x32 bit capabilities, enables the following:
- Seamless portability of code between fixed and floating-point devices
- Maintenance and support of one source code set from simulation to target device
- Adjustability of numerical resolution (Q value) based on application requirement
- Implementation of systems that may otherwise require a floating-point device
- Rapid conversion/porting and implementation of algorithms
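[Figure: Lab 8 block diagram. ePWM2 triggers the ADC on period match using SOC A (a trigger every 20.833 µs, 48 kHz). ADC RESULT0 feeds the FIR filter, and the results are stored in data memory with a pointer rewind. A connector wire links the PWM output to the ADC input.]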
Procedure
Project File
1. A project named Lab8.pjt has been created for this lab. Open the project by clicking on Project → Open and look in C:\C28x\Labs\Lab8. All Build Options have been configured the same as the previous lab. The files used in this lab are:

    Adc_6_7_8.c                      Filter.c
    CodeStartBranch.asm              Gpio.c
    DefaultIsr_8.c                   Lab_8.cmd
    DelayUs.asm                      Main_8.c
    DSP2833x_GlobalVariableDefs.c    PieCtrl_5_6_7_8_9_10.c
    DSP2833x_Headers_nonBIOS.cmd     PieVect_5_6_7_8_9_10.c
    ECap_7_8_9_10_12.c               SysCtrl.c
    EPwm_7_8_9_10_12.c               Watchdog.c
Include IQmathLib.h
4. In the CCS project window left click the plus sign (+) to the left of the Include folder. Edit Lab.h to uncomment the line that includes the IQmathLib.h header file. Next, in the Function Prototypes section, uncomment the function prototype for IQssfir(), the IQ math single-sample FIR filter function. In the Global Variable References section uncomment the four _iq references, and comment out the reference to AdcBuf[ADC_BUF_LEN]. Save the changes and close the file.
Inspect Lab_8.cmd
5. Open and inspect Lab_8.cmd. First, notice that a section called IQmath is being linked to L0123SARAM. The IQmath section contains the IQmath library functions (code). Second, notice that a section called IQmathTables is being linked to the IQTABLES with a TYPE = NOLOAD modifier after its allocation. The IQmath tables are used by the IQmath library functions. The NOLOAD modifier allows the linker to resolve all addresses in the section, but the section is not actually placed into the .out file. This is done because the section is already present in the device ROM (you cannot load data into ROM after the device is manufactured!). The tables were put in the ROM by TI when the device was manufactured. All we need to do is link the section to the addresses where it is known to already reside (the tables are the very first thing in the BOOT ROM, starting at address 0x3FE000).
Recall that this Q type will provide 8 integer bits and 24 fractional bits. Dynamic range is therefore -128 < x < +128, which is sufficient for our purposes in the workshop.
Note:
For the next step, check to be sure that the jumper wire connecting PWM1A (pin # P8-9) to ADCINA0 (pin # P9-2) is in place on the eZdsp.
12. Run the code in real-time mode using the GEL function: GEL → Realtime Emulation Control → Run_Realtime_with_Reset, and watch the memory window update. Verify that the ADC result buffer contains updated values.

13. Open and set up a dual-time graph to plot a 48-point window of the filtered and unfiltered ADC results buffer. Click: View → Graph → Time/Frequency and set the following values:

    Display Type                     Dual Time
    Start Address - upper display    AdcBufFiltered
    Start Address - lower display    AdcBuf
    Acquisition Buffer Size          48
    Display Data Size                48
    DSP Data Type                    32-bit signed integer
    Q-value                          24
    Sampling Rate (Hz)               48000
    Time Display Unit                µs

    Select OK to save the graph options.

14. The graphical display should show the generated FIR filtered 2 kHz, 25% duty cycle symmetric PWM waveform in the upper display and the unfiltered waveform generated in the previous lab exercise in the lower display. Notice the shape and phase differences between the waveform plots (the filtered curve has rounded edges, and lags the unfiltered plot by several samples). The amplitudes of both plots should run from 0 to 3.0.

15. Open and set up two (2) frequency domain plots, one for the filtered and another for the unfiltered ADC results buffer. Click: View → Graph → Time/Frequency and set the following values:
    GRAPH #1
    Display Type               FFT Magnitude
    Start Address              AdcBuf
    Acquisition Buffer Size    48
    FFT Framesize              48
    DSP Data Type              32-bit signed integer
    Q-value                    24
    Sampling Rate (Hz)         48000

    Select OK to save the graph options.
16. The graphical displays should show the frequency components of the filtered and unfiltered 2 kHz, 25% duty cycle symmetric PWM waveforms. Notice that the higher frequency components are reduced by the Low-Pass FIR filter in the filtered graph as compared to the unfiltered graph.

17. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.

Save the change to the IQmathLib.h and close the file.

19. Open the Build Options and select the Compiler tab. In the Advanced Category set the Floating Point Support to fpu32.

20. In the Build Options now select the Linker tab. In the Libraries Category change from rts2800_ml.lib to rts2800_fpu32.lib. This library is required for floating-point support. Select OK to save the Build Options.
Bode Plot of Digital Low-Pass FIR Filter Coefficients: [1/16, 4/16, 6/16, 4/16, 1/16] Sample Rate: 48 kHz
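The coefficient set above can be explored on a host PC with a minimal single-sample FIR sketch in floating-point C. This is not the lab's IQssfir() function. The Fir5 type and fir5_step name are hypothetical, and the delay line is a simple shift register rather than a circular buffer.

```c
#define FIR5_TAPS 5

/* Low-pass coefficients from the slide: [1/16, 4/16, 6/16, 4/16, 1/16] */
static const float fir5_coeff[FIR5_TAPS] = {
    0.0625f, 0.25f, 0.375f, 0.25f, 0.0625f
};

typedef struct {
    float delay[FIR5_TAPS];   /* delay[0] holds the newest sample */
} Fir5;

/* Shift in one new sample and return one filtered output sample. */
float fir5_step(Fir5 *f, float x)
{
    float acc = 0.0f;
    int i;

    for (i = FIR5_TAPS - 1; i > 0; i--) {
        f->delay[i] = f->delay[i - 1];      /* age the stored samples */
    }
    f->delay[0] = x;

    for (i = 0; i < FIR5_TAPS; i++) {
        acc += fir5_coeff[i] * f->delay[i]; /* sum of products */
    }
    return acc;
}
```

The coefficients sum to 1, so a constant input passes through with unity gain once the delay line has filled. An input that alternates between +1 and -1 (the Nyquist frequency, 24 kHz at a 48 kHz sample rate) is cancelled completely.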
Learning Objectives
- Understand the operation of the Direct Memory Access (DMA) controller
- Show how to use the DMA to transfer data between peripherals and/or memory without intervention from the CPU
Module Topics
Direct Memory Access Controller
    Module Topics
    Direct Memory Access (DMA)
        Basic Operation
        DMA Examples
        DMA Priority Modes
        DMA Throughput
        DMA Registers
    Lab 9: Servicing the ADC with DMA
[Figure: DMA block diagram. The 6-channel DMA connects to ADC Result 0-15, XINTF Zone 0, 6, 7, L4 SARAM, L5 SARAM, L6 SARAM, L7 SARAM, McBSP-A, McBSP-B and the PWM registers (SysCtrlRegs.MAPCNF.bit.MAPCNF re-maps PWM regs from PF1 to PF3). Triggers: SEQ1INT / SEQ2INT, MXEVTA / MREVTA, MXEVTB / MREVTB, XINT1-7 / 13, TINT0 / 1 / 2.]
DMA Definitions

Word
- 16 or 32 bits
- Word size is configurable per DMA channel

Burst
- Consists of multiple words
- Smallest amount of data transferred at one time

Burst Size
- Number of words per burst
- Specified by BURST_SIZE register: 5-bit N-1 value (maximum of 32 words/burst)

Transfer
- Consists of multiple bursts

Transfer Size
- Number of bursts per transfer
- Specified by TRANSFER_SIZE register: 16-bit N-1 value (exceeds any practical requirements)
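Because both size registers hold N-1 values, it is easy to be off by one when filling them in. The helpers below are a hypothetical host-side sketch (they are not part of the TI header files) that converts word and burst counts into register values and back, using the field widths given above.

```c
#include <stdint.h>

/* Convert counts into N-1 register values.
 * Returns 0 on success, -1 if a count does not fit its field. */
int dma_encode_sizes(uint32_t words_per_burst, uint32_t bursts_per_transfer,
                     uint16_t *burst_size, uint16_t *transfer_size)
{
    if (words_per_burst < 1u || words_per_burst > 32u) {
        return -1;                       /* BURST_SIZE is a 5-bit field     */
    }
    if (bursts_per_transfer < 1u || bursts_per_transfer > 65536u) {
        return -1;                       /* TRANSFER_SIZE is a 16-bit field */
    }
    *burst_size = (uint16_t)(words_per_burst - 1u);
    *transfer_size = (uint16_t)(bursts_per_transfer - 1u);
    return 0;
}

/* Total words moved in one transfer for given register values. */
uint32_t dma_words_per_transfer(uint16_t burst_size, uint16_t transfer_size)
{
    return ((uint32_t)burst_size + 1u) * ((uint32_t)transfer_size + 1u);
}
```

For example, 2 words/burst and 2 bursts/transfer encode as 0x0001 and 0x0001, which moves 4 words per transfer.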
Basic Operation
[Flowchart: Start Transfer; Read/Write Data; Moved "Burst Size" Words? (Y); Moved "Transfer Size" Bursts? (Y); End Transfer]

DMA Interrupts
- Each DMA channel has its own PIE interrupt
- The mode for each interrupt can be configured individually
- The CHINTMODE bit in the MODE register selects the interrupt mode
- Mode #1: Interrupt at start of transfer
DMA Examples
Simple Example

Objective: Move 4 words from L7 SARAM to XINTF Zone 0 and interrupt CPU at end of transfer

    BURST_SIZE*            0x0001        2 words/burst
    TRANSFER_SIZE*         0x0001        2 bursts/transfer
    * Size registers are N-1

Source Registers

    SRC_ADDR_SHADOW        0x0000F000
    SRC_BURST_STEP         0x0001
    SRC_TRANSFER_STEP      0x0001
    SRC_ADDR               0x00000000 before the transfer starts, then 0x0000F000, 0x0000F001, 0x0000F002, 0x0000F003

Destination Registers

    DST_ADDR_SHADOW        0x00004000
    DST_BURST_STEP         0x0001
    DST_TRANSFER_STEP      0x0001
    DST_ADDR               0x00000000 before the transfer starts, then 0x00004000, 0x00004001, 0x00004002, 0x00004003

Sequence: Start Transfer, move the 4 words, Interrupt to PIE, End Transfer.
Note: This example could also have been done using 1 word/burst and 4 bursts/transfer, or 4 words/burst and 1 burst/transfer. This would affect Round-Robin progression, but not interrupts.
ADC result registers (source): 0x0B00 CH0, 0x0B01 CH1, 0x0B02 CH2, 0x0B03 CH3, 0x0B04 CH4

DMA Registers:

    BURST_SIZE*            0x0004        5 words/burst
    TRANSFER_SIZE*         0x0002        3 bursts/transfer
    SRC_ADDR_SHADOW        0x00000B00
    SRC_BURST_STEP         0x0001
    SRC_TRANSFER_STEP      0xFFFC
    DST_ADDR_SHADOW        0x0000F000
    DST_BURST_STEP         0x0003
    DST_TRANSFER_STEP      0xFFF5

L7 SARAM (destination):

    0xF000 - 0xF002        CH0, CH0, CH0
    0xF003 - 0xF005        CH1, CH1, CH1
    0xF006 - 0xF008        CH2, CH2, CH2
    0xF009 - 0xF00B        CH3, CH3, CH3
    0xF00C - 0xF00E        CH4, CH4, CH4

* Size registers are N-1
** Typically use a relocatable symbol in your code, not a hard value
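The step values in the register list above can be checked with a small host-side model of the address pointer. The sketch below is an assumption about how to model the behavior, based on the flowchart. It adds the burst step after every word except the last one in a burst, and the transfer step after every burst except the last one in a transfer. The type and function names are hypothetical, and the wrap function is not modeled.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint16_t burst_size;      /* words per burst, N-1     */
    uint16_t transfer_size;   /* bursts per transfer, N-1 */
    uint32_t start_addr;      /* value of the shadow address register */
    int16_t  burst_step;      /* signed step, e.g. 0x0003 */
    int16_t  transfer_step;   /* signed step, e.g. 0xFFF5 is -11 */
} DmaPointerModel;

/* Writes the address used for each word into out[] (up to max_out
 * entries) and returns the number of words in the transfer. */
size_t dma_model_addresses(const DmaPointerModel *m,
                           uint32_t *out, size_t max_out)
{
    uint32_t addr = m->start_addr;
    size_t n = 0;
    uint32_t burst;
    uint32_t word;

    for (burst = 0; burst <= m->transfer_size; burst++) {
        for (word = 0; word <= m->burst_size; word++) {
            if (n < max_out) {
                out[n] = addr;
            }
            n++;
            if (word < m->burst_size) {
                addr += (uint32_t)(int32_t)m->burst_step;
            }
        }
        if (burst < m->transfer_size) {
            addr += (uint32_t)(int32_t)m->transfer_step;
        }
    }
    return n;
}
```

With the destination values above (start 0xF000, burst step 3, transfer step -11) the model visits 0xF000, 0xF003, 0xF006, 0xF009, 0xF00C in the first burst and starts the second burst at 0xF001, which reproduces the L7 SARAM layout shown. With the source values (start 0x0B00, burst step 1, transfer step -4) every burst re-reads 0x0B00 through 0x0B04.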
Wrap Function:
- Reloads address pointer after specified number of bursts
- Allows a cumulative signed offset to be added each wrap

New Registers:
- WRAP_SIZE = bursts/wrap - 1
- BEG_ADDR = Wrap beginning address
- WRAP_STEP = added to BEG_ADDR before wrapping

[Flowchart: Wait for event to start/continue transfer; Read/Write Data; Moved "Burst Size" Words? If N, Add Burst Step to Address Pointer. If Y, Moved "Wrap Size" Bursts? If N, Add Transfer Step to Address Pointer. If Y, Add WRAP_STEP to BEG_ADDR and Load into Address Pointer.]
L4 SARAM, starting at 0xC140: 48 word Ping buffer (DMA Interrupt), then 48 word Pong buffer (DMA Interrupt)

DMA Registers:

    BURST_SIZE*            0x0000        1 word/burst
    TRANSFER_SIZE*         0x002F        48 bursts/transfer
    SRC_ADDR_SHADOW        0x00000B00    starting address
    SRC_BURST_STEP         don't care    since BURST_SIZE = 0
    SRC_TRANSFER_STEP      0x0001
    SRC_BEG_ADDR_SHADOW    0x00000B00    starting wrap address
    SRC_WRAP_SIZE*         0x000F        wrap after 16 words
    SRC_WRAP_STEP          0x0000
    DST_ADDR_SHADOW        0x0000C140    starting address**
    DST_BURST_STEP         0x0000        since BURST_SIZE = 0
    DST_TRANSFER_STEP      0x0001
    DST_BEG_ADDR_SHADOW    don't care    not using dst wrap
    DST_WRAP_SIZE*         0xFFFF        no wrap
    DST_WRAP_STEP          don't care    not using dst wrap

[Flowchart: same transfer flow as the wrap function, from Start Transfer to End Transfer]

Other: DMA configured to re-init after transfer (CONTINUOUS = 1)
* Size registers are N-1
** DST_ADDR_SHADOW must be changed between ping and pong buffer address in the DMA ISR. Typically use a relocatable symbol in your code, not a hard value.
[Figure: channel priority wheel (CH1, CH2, CH3, CH4, CH5) beside the transfer flowchart (Read/Write Data; Moved "Burst Size" Words? Y; Moved "Transfer Size" Bursts? Y; End Transfer), marking the point where CH1 can interrupt other channels in CH1 Priority Mode]
DMA Throughput
- 4 cycles/word (5 for McBSP reads)
- 1 cycle delay to start each burst
- 1 cycle delay returning from CH1 high priority interrupt
- 32-bit transfer doubles throughput (except McBSP, which supports 16-bit transfers only)

Example: 128 16-bit words from ADC to RAM

    8 bursts * [(4 cycles/word * 16 words/burst) + 1] = 520 cycles
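The throughput estimate above is simple enough to put in a helper for trying other burst arrangements. This is a sketch with a hypothetical function name. It includes only the per-word cost and the one-cycle burst start delay from the list above, and ignores the CH1 high-priority return delay and any bus arbitration.

```c
#include <stdint.h>

/* Estimated cycles for one transfer:
 * bursts * [(cycles_per_word * words_per_burst) + 1] */
uint32_t dma_estimate_cycles(uint32_t bursts, uint32_t words_per_burst,
                             uint32_t cycles_per_word)
{
    return bursts * ((cycles_per_word * words_per_burst) + 1u);
}
```

The ADC example corresponds to dma_estimate_cycles(8, 16, 4). Moving the same data as 32-bit words (8 bursts of 8 words) gives 264 cycles under the same assumptions, which is roughly the doubling in throughput noted above.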
DMA Registers
DmaRegs.name (lab file: Dma.c)

    Register              Description
    DMACTRL               DMA Control Register
    PRIORITYCTRL1         Priority Control Register 1

    DMA CHx Registers
    MODE                  Mode Register
    CONTROL               Control Register
    BURST_SIZE            Burst Size Register
    BURST_COUNT           Burst Count Register
    SRC_BURST_STEP        Source Burst Step Size Register
    DST_BURST_STEP        Destination Burst Step Size Register
    TRANSFER_SIZE         Transfer Size Register
    TRANSFER_COUNT        Transfer Count Register
    SRC_TRANSFER_STEP     Source Transfer Step Size Register
    DST_TRANSFER_STEP     Destination Transfer Step Size Register
    SRC_ADDR_SHADOW       Shadow Source Address Pointer Register
    SRC_ADDR              Active Source Address Pointer Register
    DST_ADDR_SHADOW       Shadow Destination Address Pointer Register
    DST_ADDR              Active Destination Address Pointer Register

For a complete list of registers refer to the DMA Module Reference Guide
    Bits 15-2    reserved
    Bit 1        PRIORITYRESET
    Bit 0        HARDRESET

Priority Reset: 0 = writes ignored (always reads back 0); 1 = reset state-machine after any pending burst transfer complete
    Bits 15-1    reserved
    Bit 0        CH1PRIORITY

DMA CH1 Priority: 0 = same priority as other channels; 1 = highest priority channel
Mode Register
DmaRegs.CHx.MODE

Upper Register:

    Bit 15    CHINTE
    Bit 14    DATASIZE
    Bit 13    SYNCSEL
    Bit 12    SYNCE
    Bit 11    CONTINUOUS
    Bit 10    ONESHOT

- Channel Interrupt: 0 = disable; 1 = enable
- Sync Mode Select: 0 = SRC wrap counter; 1 = DST wrap counter
- One Shot Mode: 0 = one burst transfer per trigger; 1 = subsequent burst transfers occur without additional trigger
Mode Register
DmaRegs.CHx.MODE

Lower Register (bits 9, 8 and 7 are shown on the slide):

- Channel Interrupt Generation: 0 = at beginning of transfer; 1 = at end of transfer
Control Register
DmaRegs.CHx.CONTROL

Upper Register (bits 15 - 8):

- Overflow Flag *: 0 = no overflow; 1 = overflow
- Run Status *: 0 = channel disabled; 1 = channel enabled
- Burst Status *: 0 = no activity; 1 = servicing burst
- Transfer Status *: 0 = no activity; 1 = transferring
- Sync Error *: 0 = no error; 1 = ADCSYNC error
- Sync Flag *: 0 = no sync event; 1 = ADCSYNC event
- Peripheral Interrupt Trigger Flag *: 0 = no interrupt event trigger; 1 = interrupt event trigger

* = read-only
Control Register
DmaRegs.CHx.CONTROL

Lower Register (bits 7 - 0):

- Error Clear: 0 = no effect; 1 = clear SYNCERR
- Sync Clear: 0 = no effect; 1 = clear SYNCFLG
- Peripheral Interrupt Clear: 0 = no effect; 1 = clears event and PERINTFLG
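[Figure: Lab 9 block diagram. ePWM2 triggers the ADC on period match using SOC A (a trigger every 20.833 µs, 48 kHz). The DMA moves ADC RESULT0 into the buffer, which feeds the FIR filter, with a pointer rewind. A connector wire links the PWM output to the ADC input.]

Objective: Configure the DMA to buffer ADC Channel A0 ping-pong style with 48 samples per buffer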
Procedure
Project File
1. A project named Lab9.pjt has been created for this lab. Open the project by clicking on Project → Open and look in C:\C28x\Labs\Lab9. All Build Options have been configured the same as the previous lab. The files used in this lab are:

    Adc_9_10_12.c                    Filter.c
    CodeStartBranch.asm              Gpio.c
    DefaultIsr_9_10_12a.c            Lab_9.cmd
    DelayUs.asm                      Main_9.c
    Dma.c                            PieCtrl_5_6_7_8_9_10.c
    DSP2833x_GlobalVariableDefs.c    PieVect_5_6_7_8_9_10.c
    DSP2833x_Headers_nonBIOS.cmd     SysCtrl.c
    ECap_7_8_9_10_12.c               Watchdog.c
    EPwm_7_8_9_10_12.c
Inspect Lab_9.cmd
2. Open and inspect Lab_9.cmd. Notice that a section called dmaMemBufs is being linked to L4SARAM. This section links the destination buffer for the DMA transfer to a DMA accessible memory space.
10. Run the code in real-time mode using the GEL function: GEL → Realtime Emulation Control → Run_Realtime_with_Reset, and watch the memory window update. Verify that the ADC result buffer contains updated values.

11. Set up a dual-time graph of the filtered and unfiltered ADC results buffer. Click: View → Graph → Time/Frequency and set the following values:

    Display Type                     Dual Time
    Start Address - upper display    AdcBufFiltered
    Start Address - lower display    AdcBuf
    Acquisition Buffer Size          48
    Display Data Size                48
    DSP Data Type                    32-bit floating-point
    Sampling Rate (Hz)               48000
    Time Display Unit                µs

12. The graphical display should show the generated FIR filtered 2 kHz, 25% duty cycle symmetric PWM waveform in the upper display and the unfiltered waveform in the lower display. You should see that the results match the previous lab exercise.

13. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.

End of Exercise
System Design
Introduction
This module discusses various aspects of system design. Details of the emulation and analysis block along with JTAG will be explored. Flash memory programming and the Code Security Module will be described.
Learning Objectives
- Emulation and Analysis Block
- External Interface (XINTF)
- Flash Configuration and Memory Performance
- Flash Programming
- Code Security Module (CSM)
Module Topics
System Design
    Module Topics
    Emulation and Analysis Block
    External Interface (XINTF)
    Flash Configuration and Memory Performance
    Flash Programming
    Code Security Module (CSM)
    Lab 10: Programming the Flash
[Figure: emulator connected through the JTAG header to the TMS320C2000 device (scan out and GND connections shown)]

Some Available Emulators
- BlackHawk: USB2000
- Olimex: TMS320-JTAG-USB
- Signum System: JTAGjet-TMS-C2000
- Spectrum Digital: XDS-510LC

These emulators are C2000 specific, and are much lower cost than emulators that support all TI DSP platforms (although those can certainly be used)
Debug Activity
- Halt on a specified instruction (for debugging in Flash)
- A memory location is getting corrupted; halt the processor when any value is written to this location
- Halt program execution after a specific value is written to a variable
- Halt on a specified instruction only after some other specific routine has executed
2 Address Watchpoints
Bus selection
- Stack grows towards higher memory addresses
- Monitor for data writes in region near the end of the stack

[Figure: stack region in Data Memory]
    Start address    Block
    0x000E00         reserved; PF 0 (6Kw)
    0x002000
    0x004000         XINTF Zone 0 (4Kw)
    0x005000         PF 3 (4Kw)
    0x006000         PF 1 (4Kw); reserved
    0x007000         PF 2 (4Kw)
    0x008000         L0 SARAM (4Kw)
    0x009000         L1 SARAM (4Kw)
    0x00A000         L2 SARAM (4Kw)
    0x00B000         L3 SARAM (4Kw)
    0x00C000         L4 SARAM (4Kw)
    0x00D000         L5 SARAM (4Kw)
    0x00E000         L6 SARAM (4Kw)
    0x00F000         L7 SARAM (4Kw)
    0x010000
    Start address    Block
    0x33FFF8         PASSWORDS (8w)
    0x340000         reserved
    0x380080         ADC calibration data
    0x380090         reserved
    0x380400         User OTP (1Kw)
    0x380800         reserved
    0x3F8000         L0 SARAM (4Kw)
    0x3F9000         L1 SARAM (4Kw)
    0x3FA000         L2 SARAM (4Kw)
    0x3FB000         L3 SARAM (4Kw)
    0x3FC000         reserved
    0x3FE000         Boot ROM (8Kw)
    0x3FFFC0         BROM Vectors (64w)
    0x3FFFFF         (end of map)

(The figure shows both the Data and Program spaces.)

- Dual Mapped: L0, L1, L2, L3
- CSM Protected: L0, L1, L2, L3, OTP, FLASH, ADC CAL, Flash regs in PF0
- DMA Accessible: L4, L5, L6, L7, XINTF Zone 0, 6, 7
[Figure: TMS320F28335 XINTF signals: XWE0, XRD, XR/W, XZCS0, XZCS6, XZCS7 (zone selects), XREADY, XCLKOUT, XHOLD, XHOLDA]
[Figure: TMS320F28335 connected to external 16-bit SRAM over D(15:0) and A(18:0): a single 16-bit SRAM, and a pair of 16-bit SRAMs holding the low word and the hi word]
XINTF Timings
- Three external zones: 0, 6, 7
- Each zone has separate read and write timings
- XREADY signal can be used to extend ACTIVE phase

[Figure: Read Timing. XZCS, XRD, XA[19:0] (valid address) and XD[] (valid data) are shown across the XRDLEAD, XRDACTIVE and XRDTRAIL phases, with the SRAM access time ta(A) and the point where the DSP latches data.]
XINTF Clocking

[Figure: the C28x CPU clock SYSCLKOUT passes through a /1 or /2 selection (XINTCNF2.XTIMCLK) to form XTIMCLK, which times the Lead/Active/Trail phases set by XTIMING0, XTIMING6, XTIMING7 and XBANK. XTIMCLK passes through a second /1 or /2 selection (XINTCNF2.CLKMODE) and an on/off control (XINTCNF2.CLKOFF) to form XCLKOUT.]

Specify read timing and write timing separately, for each zone:

    Lead:      1-3 XTIMCLK Cycles
    Active:    0-7 XTIMCLK Cycles
    Trail:     0-3 XTIMCLK Cycles

Each zone has a X2TIMING bit that can double the timing values (both read and write affected)
XINTF Registers
    Name        Address      Size (x16)    Description
    XTIMING0    0x00 0B20    2             XINTF Zone 0 Timing Register
    XTIMING6    0x00 0B2C    2             XINTF Zone 6 Timing Register
    XTIMING7    0x00 0B2E    2             XINTF Zone 7 Timing Register
    XINTCNF2    0x00 0B34    2             XINTF Configuration Register
    XBANK       0x00 0B38    1             XINTF Bank Control Register
    XRESET      0x00 0B3D    1             XINTF Reset Register

- XTIMINGx specifies read and write timings (lead, active, trail), interface size (16 or 32 bit), X2TIMING, XREADY usage
- XINTCNF2 selects SYSCLKOUT/1 or SYSCLKOUT/2 as fundamental clock speed XTIMCLK (for lead, active, trail), XHOLD control, write buffer control
- XBANK specifies the number of XTIMCLK cycles to add between two specified zones (bank switching)
- XRESET is used to do a hard reset in the case where the CPU detects a stuck XREADY during a DMA transfer

Bank switching example: Suppose the external device in zone 7 is slow getting off the bus. Add 3 additional cycles when switching from zone 7 to another zone to avoid bus contention:

    XintfRegs.XBANK.bit.BANK = 7;    // Select Zone 7
    XintfRegs.XBANK.bit.BCYC = 3;    // Add 3 XTIMCLK cycles
FlashRegs.FBANKWAIT
    Fields, from bit 15 down to bit 0: reserved, PAGEWAIT, reserved, RANDWAIT (bits 3-0)

FlashRegs.FOTPWAIT
    Fields: reserved, OTPWAIT

*** Refer to the F2833x datasheet for detailed numbers ***
For 150 MHz, PAGEWAIT = 5, RANDWAIT = 5, OTPWAIT = 8
For 100 MHz, PAGEWAIT = 3, RANDWAIT = 3, OTPWAIT = 5

FlashRegs.FOPT.bit.ENPIPE = 1;
    Fields: bits 15-1 reserved, bit 0 ENPIPE
[Table: memory performance comparison with a Notes column; surviving entries: 1, 0.167, 0.5, 0.25, 32-bit, 16-bit]

- Internal RAM has best data performance; put time critical data here
- External RAM can generally outperform the flash for data access, but increases cost and power consumption
- Flash performance usually sufficient for most constants and tables
- Note that the flash instruction fetch pipeline will also stall during a flash data access
- FPWR: Save power by putting Flash/OTP to Sleep or Standby mode; Flash will automatically enter active mode if a Flash/OTP access is made
- FSTATUS: Various status bits (e.g. PWR mode)
- FSTDBYWAIT, FACTIVEWAIT: Specify # of delay cycles during wake-up from sleep to standby, and from standby to active, respectively. The delay is needed to let the flash stabilize. Leave these registers set to their default maximum value.
See the TMS320x2833x System Control and Interrupts Reference Guide, SPRUFB0, for more information
Flash Programming
[Figure: flash programming block diagram showing Emulator, RS232, RAM, Flash, Data and the TMS320F2833x]

    Algorithm    Function
    Erase        - Set all bits to zero, then to one
    Program      - Program selected bits with zero
    Verify       - Verify flash contents

- Minimum Erase size is a sector (32Kw or 16Kw)
- Minimum Program size is a bit!
- Important not to lose power during erase step: If CSM passwords happen to be all zeros, the CSM will be permanently locked!
- Chance of this happening is quite small! (Erase step is performed sector by sector)
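The erase and program rules above can be captured in a tiny host-side model. It is only a sketch of the bit behavior described on the slide, not the TI Flash API: the function names are hypothetical and a "sector" is just an array of 16-bit words.

```c
#include <stdint.h>
#include <stddef.h>

/* Erase: set all bits to zero, then to one (whole sector at a time). */
void flash_model_erase(uint16_t *sector, size_t words)
{
    size_t i;

    for (i = 0; i < words; i++) {
        sector[i] = 0x0000u;
    }
    for (i = 0; i < words; i++) {
        sector[i] = 0xFFFFu;
    }
}

/* Program: selected bits can only be changed from one to zero,
 * so the stored word is the AND of the old and new values. */
uint16_t flash_model_program(uint16_t *cell, uint16_t value)
{
    *cell = (uint16_t)(*cell & value);
    return *cell;
}

/* Verify: returns 1 if the contents match what was expected. */
int flash_model_verify(const uint16_t *cells, const uint16_t *expected,
                       size_t words)
{
    size_t i;

    for (i = 0; i < words; i++) {
        if (cells[i] != expected[i]) {
            return 0;
        }
    }
    return 1;
}
```

In this model, programming 0x1234 into an erased word stores 0x1234, and programming 0xFF00 on top of it leaves 0x1200. Bits that are already zero stay zero until the whole sector is erased again.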
Flash Programming
- SDFlash Serial utility (uses SCI boot)
- Gang Programmers (use GPIO boot)
    - BP Micro programmer
    - Data I/O programmer
[Figure: CSM protected memory]

    0x300000    FLASH (256Kw), with the 128-Bit Password at the top of the flash
    0x340000    (end of flash)
    0x380400    OTP (1Kw)
    0x3F8000    L0 SARAM (4Kw)    Dual Mapped
    0x3F9000    L1 SARAM (4Kw)    Dual Mapped
    0x3FA000    L2 SARAM (4Kw)    Dual Mapped
    0x3FB000    L3 SARAM (4Kw)    Dual Mapped

- Data reads and writes from restricted memory are only allowed for code running from restricted memory
- All other data read/write accesses are blocked: JTAG emulator/debugger, ROM bootloader, code running in external memory or unrestricted internal memory
CSM Password

    0x300000    FLASH (256Kw)
    0x33FFF8    128-Bit Password

- 128-bit user defined password is stored in Flash
- 128-bit KEY registers are used to lock and unlock the device
    - Mapped in memory space 0x00 0AE0 - 0x00 0AE7
    - Registers EALLOW protected
CSM Registers
Key Registers - accessible by user; EALLOW protected

    Address      Name      Description
    0x00 0AE0    KEY0      Low word of 128-bit Key register
    0x00 0AE1    KEY1      2nd word of 128-bit Key register
    0x00 0AE2    KEY2      3rd word of 128-bit Key register
    0x00 0AE3    KEY3      4th word of 128-bit Key register
    0x00 0AE4    KEY4      5th word of 128-bit Key register
    0x00 0AE5    KEY5      6th word of 128-bit Key register
    0x00 0AE6    KEY6      7th word of 128-bit Key register
    0x00 0AE7    KEY7      High word of 128-bit Key register
    0x00 0AEF    CSMSCR    CSM status and control register

PWL in memory - reserved for passwords only

    Address      Name      Description
    0x33 FFF8    PWL0      Low word of 128-bit password
    0x33 FFF9    PWL1      2nd word of 128-bit password
    0x33 FFFA    PWL2      3rd word of 128-bit password
    0x33 FFFB    PWL3      4th word of 128-bit password
    0x33 FFFC    PWL4      5th word of 128-bit password
    0x33 FFFD    PWL5      6th word of 128-bit password
    0x33 FFFE    PWL6      7th word of 128-bit password
    0x33 FFFF    PWL7      High word of 128-bit password
CSM Caveats
- Never program all the PWLs as 0x0000
    - Doing so will permanently lock the CSM
- Flash addresses 0x33FF80 to 0x33FFF5, inclusive, must be programmed to 0x0000 to securely lock the CSM
- Remember that code running in unsecured RAM cannot access data in secured memory
    - Don't link the stack to secured RAM if you have any code that runs from unsecured RAM
[Flowchart: CSM unlock procedure. Start; write the password to the KEY registers (0x00 0AE0 to 0x00 0AE7, EALLOW protected); decision "Correct password?" with Yes and No branches.]
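The unlock check in the flowchart can be illustrated with a small host-side model. This is only a sketch under stated assumptions: the structure `csm_model_t` and both function names are hypothetical, the model runs on a PC rather than on the device, and it does not touch the real KEY registers or EALLOW protection. It shows the comparison of the eight KEY words against the eight PWL words, and the caveat that an all-0x0000 password cannot be used to unlock.

```c
#include <stdbool.h>
#include <stdint.h>

#define CSM_WORDS 8 /* 128-bit password = eight 16-bit words */

/* Hypothetical host-side model of the CSM; not the device's register file. */
typedef struct {
    uint16_t pwl[CSM_WORDS]; /* password locations in flash (PWL0..PWL7) */
    uint16_t key[CSM_WORDS]; /* KEY registers (KEY0..KEY7) */
} csm_model_t;

/* True when every password word is 0x0000 (the permanently locked case). */
bool csm_is_permanently_locked(const csm_model_t *csm)
{
    for (int i = 0; i < CSM_WORDS; i++) {
        if (csm->pwl[i] != 0x0000u) {
            return false;
        }
    }
    return true;
}

/* Load a candidate password into the KEY registers of the model and
 * report whether all eight words match the stored password. */
bool csm_try_unlock(csm_model_t *csm, const uint16_t password[CSM_WORDS])
{
    bool match = true;
    for (int i = 0; i < CSM_WORDS; i++) {
        csm->key[i] = password[i];
        if (csm->key[i] != csm->pwl[i]) {
            match = false;
        }
    }
    /* An all-zero password can never be used to unlock. */
    return match && !csm_is_permanently_locked(csm);
}
```

On the real device the KEY writes would be wrapped in EALLOW/EDIS; that part is deliberately left out of this model.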
[Lab 10 block diagram. Recovered labels: ADC, RESULT0, DMA, connector wire, Pointer rewind, FIR Filter. ePWM2 triggers the ADC on period match using the SOC A trigger every 20.833 µs (48 kHz).]
Objective:
Program system into Flash Memory
Learn use of CCS Flash Plug-in
DO NOT PROGRAM PASSWORDS
Procedure
Project File
1. A project named Lab10.pjt has been created for this lab. Open the project by clicking on Project → Open… and look in C:\C28x\Labs\Lab10. All Build Options have been configured the same as the previous lab. The files used in this lab are:
Adc_9_10_12.c
CodeStartBranch.asm
DefaultIsr_9_10_12a.c
DelayUs.asm
Dma.c
DSP2833x_GlobalVariableDefs.c
DSP2833x_Headers_nonBIOS.cmd
ECap_7_8_9_10_12.c
EPwm_7_8_9_10_12.c
Filter.c
Gpio.c
Lab_10.cmd
Main_10.c
PieCtrl_5_6_7_8_9_10.c
PieVect_5_6_7_8_9_10.c
SysCtrl.c
Watchdog.c
5. Open and inspect InitPieCtrl() in PieCtrl_5_6_7_8_9_10.c. Notice the memcpy() function used to initialize (copy) the PIE vectors. At the end of the file a structure is used to enable the PIE.
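The idea behind copying the vectors with memcpy() can be sketched on a host PC as follows. This is not the code in PieCtrl_5_6_7_8_9_10.c: the names `isr_fn`, `copy_vectors`, `isr_a`, `isr_b` and the table size `VECT_COUNT` are hypothetical and chosen only for illustration. The point is that a table of ISR pointers stored in one place (for example flash) is copied as a block into the live table in RAM.

```c
#include <stddef.h>
#include <string.h>

typedef void (*isr_fn)(void); /* interrupt service routine pointer */

#define VECT_COUNT 4 /* hypothetical table size, kept small for illustration */

/* Two placeholder service routines (hypothetical). */
void isr_a(void) {}
void isr_b(void) {}

/* Copy a table of ISR pointers from its load image to the live table. */
void copy_vectors(isr_fn *live, const isr_fn *image, size_t count)
{
    memcpy(live, image, count * sizeof(isr_fn));
}
```

A caller would pass the address of the RAM table as `live` and the address of the stored table as `image`.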
placed in the password locations to protect your code. We will not be using real passwords in the workshop. The CSM also requires programming values of 0x0000 into flash addresses 0x33FF80 through 0x33FFF5 in order to properly secure the CSM. Both tasks will be accomplished using a simple assembly language file Passwords.asm.
11. Add Passwords.asm to the project.
12. Open and inspect Passwords.asm. This file specifies the desired password values (DO NOT CHANGE THE VALUES FROM 0xFFFF) and places them in an initialized section named passwords. It also creates an initialized section named csm_rsvd which contains all 0x0000 values for locations 0x33FF80 to 0x33FFF5 (length of 0x76).
13. Open Lab_10.cmd and notice that the initialized sections for passwords and csm_rsvd are linked to memories named PASSWORDS and CSM_RSVD, respectively.
Build Lab.out
17. At this point we need to build the project, but not have CCS automatically load it, since CCS cannot load code into the flash (the flash must be programmed). On the menu bar click: Option → Customize… and select the Program/Project CIO tab. Uncheck "Load Program After Build". CCS has a feature that automatically steps over functions without debug information. This can be useful for accelerating the debug process provided that you are not interested in debugging the function that is being stepped over. While single-stepping in this lab exercise we do not want to step over any functions. Therefore, select the Debug Properties tab. Uncheck "Step over functions without debug information when source stepping", then click OK.
18. Click the Build button to generate the Lab.out file to be used with the CCS Flash Plug-in.
20. A Clock Configuration window may open. If needed, in the Clock Configuration window set OSCCLK (MHz): to 30, DIVSEL: to /2, and PLLCR Value: to 10. Then click OK. In the next Flash Programmer Settings window confirm that the selected DSP device to program is F28335 and all options have been checked. Click OK.
21. Notice that the eZdsp board uses a 30 MHz oscillator (located on the board near LED DS1). Confirm the Clock Configuration in the upper left corner has the OSCCLK set to 30 MHz, the DIVSEL set to /2, and the PLLCR value set to 10. Recall that the PLL output is divided by two, which gives a SYSCLKOUT of 150 MHz.
22. Confirm that all boxes are checked in the Erase Sector Selection area of the plug-in window. We want to erase all the flash sectors.
23. We will not be using the plug-in to program the Code Security Password. Do not modify the Code Security Password fields.
24. In the Operation block, notice that the COFF file to Program/Verify field automatically defaults to the current .out file. Check to be sure that Erase, Program, Verify is selected. We will be using the default wait states, as shown on the slide in this module.
25. Click Execute Operation to program the flash memory. Watch the programming status update in the plug-in window.
26. After successfully programming the flash memory, close the programmer window.
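The clock arithmetic in step 21 can be checked with a few lines of C. This is a minimal sketch that runs on a host PC; the function name `sysclkout_hz` is hypothetical, and treating PLLCR = 0 as a multiplier of 1 (PLL bypass) is an assumption of this sketch. With OSCCLK = 30 MHz, PLLCR = 10 and a divider of 2, the result is 30 MHz x 10 / 2 = 150 MHz.

```c
#include <stdint.h>

/* SYSCLKOUT = OSCCLK * PLLCR / divider.
 * Assumption of this sketch: PLLCR == 0 means the PLL is bypassed (x1). */
uint32_t sysclkout_hz(uint32_t oscclk_hz, uint32_t pllcr, uint32_t divider)
{
    uint32_t multiplier = (pllcr == 0u) ? 1u : pllcr;
    if (divider == 0u) {
        return 0u; /* invalid divider */
    }
    return (oscclk_hz * multiplier) / divider;
}
```

For example, `sysclkout_hz(30000000u, 10u, 2u)` evaluates to 150000000.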
and select Lab10.out in the Debug folder.
28. Reset the DSP. The program counter should now be at 0x3FF9A9, which is the start of the bootloader in the Boot ROM.
29. Single-Step <F11> through the bootloader code until you arrive at the beginning of the codestart section in the CodeStartBranch.asm file. (Be patient, it will take about 125 single-steps). Notice that we have placed some code in CodeStartBranch.asm to give an option to first disable the watchdog, if selected.
30. Step a few more times until you reach the start of the C-compiler initialization routine at the symbol _c_int00.
31. Now do Debug → Go Main. The code should stop at the beginning of your main() routine. If you got to that point successfully, it confirms that the flash has been programmed properly, that the bootloader is properly configured for jump to flash mode, and that the codestart section has been linked to the proper address.
32. You can now RUN the DSP, and you should observe the LED on the board blinking. Try resetting the DSP and hitting RUN (without doing all the stepping and the Go Main procedure). The LED should be blinking again.
Position 4    Position 3    Position 2    Position 1
GPIO87        GPIO86        GPIO85        GPIO84
Right 0       Left 1        Right 0       Right 0
End of Exercise
Lab_10.cmd (flash memory regions recovered from the slide):
Name           Origin       Length     Page
FLASH          0x30 0000    0x3FF80    0
CSM_RSVD       0x33 FF80    0x76       0
BEGIN_FLASH    0x33 FFF6    0x2        0
PASSWORDS      0x33 FFF8    0x8        0
The SECTIONS { } block on the slide lists codestart and csm_rsvd.
Other labels on the slide: FLASH (256Kw), rts2800_ml.lib, 0x33 7FF6, 0x3F F000, RESET.
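The region sizes for FLASH, CSM_RSVD, BEGIN_FLASH and PASSWORDS can be cross-checked arithmetically: each region should start where the previous one ends, and the last one should end at 0x340000, the end of the 256Kw flash. The following host-side sketch does that check. The type `mem_region_t`, the array `lab10_flash_map` and the function names are hypothetical, and the FLASH origin of 0x300000 is taken from the memory map slide rather than from the linker command file itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t origin; /* first word address */
    uint32_t length; /* length in 16-bit words */
} mem_region_t;

/* Values as listed on the slide; FLASH origin assumed to be 0x300000. */
const mem_region_t lab10_flash_map[4] = {
    { "FLASH",       0x300000u, 0x3FF80u },
    { "CSM_RSVD",    0x33FF80u, 0x76u    },
    { "BEGIN_FLASH", 0x33FFF6u, 0x2u     },
    { "PASSWORDS",   0x33FFF8u, 0x8u     },
};

/* Address one past the last word of a region. */
uint32_t region_end(const mem_region_t *r)
{
    return r->origin + r->length;
}

/* True if every region begins exactly where the previous one ends. */
bool regions_contiguous(const mem_region_t *r, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (r[i].origin != region_end(&r[i - 1])) {
            return false;
        }
    }
    return true;
}
```

The same check explains the csm_rsvd length: 0x33FF80 through 0x33FFF5 inclusive is 0x33FFF6 - 0x33FF80 = 0x76 words.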
Communications
Introduction
The TMS320C28x contains features that allow several methods of communication and data exchange between the C28x and other devices. Many of the most commonly used communications techniques are presented in this module. The intent of this module is not to give exhaustive design details of the communication peripherals, but rather to provide an overview of the features and capabilities. Once these features and capabilities are understood, additional information can be obtained from various resources such as documentation, as needed. This module will cover the basic operation of the communication peripherals, as well as some basic terms and how they work.
Learning Objectives
Serial Peripheral Interface (SPI)
Serial Communication Interface (SCI)
Multichannel Buffered Serial Port (McBSP)
Inter-Integrated Circuit (I2C)
Enhanced Controller Area Network (eCAN)
Note: Up to 1 SPI module (A), 3 SCI modules (A/B/C), 2 McBSP modules (A/B), 1 I2C module (A), and 2 eCAN modules (A/B) are available on the F2833x devices.
Module Topics
Communications
    Module Topics
    Communications Techniques
    Serial Peripheral Interface (SPI)
        SPI Registers
        SPI Summary
    Serial Communications Interface (SCI)
        Multiprocessor Wake-Up Modes
        SCI Registers
        SCI Summary
    Multichannel Buffered Serial Port (McBSP)
    Inter-Integrated Circuit (I2C)
        I2C Operating Modes and Data Formats
        I2C Summary
    Enhanced Controller Area Network (eCAN)
        CAN Bus and Node
        Principles of Operation
        Message Format and Block Diagram
        eCAN Summary
Communications Techniques
Several methods of implementing a TMS320C28x communications system are possible. The method selected for a particular design should reflect the method that meets the required data rate at the lowest cost. Various categories of interface are available and are summarized in the learning objective slide. Each will be described in this module.
Asynchronous:
longer distances
Lower data rate (about 1/8 of SPI)
Implied clock (clk/data mixed)
Economical with reasonable performance
[Figure labels: C28x, Port, U2, Destination, PCB]
Serial ports provide a simple, hardware-efficient means of high-level communication between devices. Like the GPIO pins, they may be used in stand-alone or multiprocessing systems. In a multiprocessing system, they are an excellent choice when both devices have an available serial port and the data rate requirement is relatively low. A serial interface is even more desirable when the devices are physically distant from each other because the inherently low number of wires provides a simpler interconnection. Serial ports require separate lines to implement, and they do not interfere in any way with the data and address lines of the processor. The only overhead they require is to read/write new words from/to the ports as each word is received/transmitted. This process can be performed as a short interrupt service routine under hardware control, requiring only a few cycles to maintain. The C28x family of devices has both synchronous and asynchronous serial ports. Detailed features and operation will be described next.
In its simplest form, the SPI can be thought of as a programmable shift register. Data is shifted in and out of the SPI through the SPIDAT register. Data to be transmitted is written directly to the SPIDAT register, and received data is latched into the SPIBUF register for reading by the CPU. This allows for double-buffered receive operation, in that the CPU need not read the current received data from SPIBUF before a new receive operation can be started. However, the CPU must read SPIBUF before the new operation is complete or a receiver overrun error will occur. In addition, double-buffered transmit is not supported: the current transmission must be complete before the next data character is written to SPIDAT or the current transmission will be corrupted. The Master can initiate a data transfer at any time because it controls the SPICLK signal. The software, however, determines how the Master detects when the Slave is ready to broadcast.
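[Figure: SPI block diagram. Recovered labels: SPIRXBUF.15-0 with receive FIFO (up to RX FIFO_15), SPIDAT.15-0 shift register (MSB to LSB), SPITXBUF.15-0 with transmit FIFO (from TX FIFO_0), SPISOMI pin.]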
Since data is shifted out of the SPIDAT register MSB first, transmission characters of less than 16 bits must be left-justified by the CPU software prior to being written to SPIDAT. Received data is shifted into SPIDAT from the left, MSB first. However, the entire sixteen bits of SPIDAT are copied into SPIBUF after the character transmission is complete, such that received characters of less than 16 bits will be right-justified in SPIBUF. The unused higher-significance bits must be masked off by the CPU software when it interprets the character. For example, a 9 bit character transmission would require masking off the 7 MSBs.
SPIDAT - Processor #1
11001001XXXXXXXX 11001001XXXXXXXX
Received data of less than 16 bits are right justified. User software must mask off unused MSBs.
SPIDAT - Processor #2
XXXXXXXX11001001 XXXXXXXX11001001
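The left-justify and mask steps can be sketched in C as below. This is a host-side illustration with hypothetical helper names (`spi_left_justify`, `spi_mask_received`); it does not access the SPI registers. For the 8-bit character 11001001 (0xC9) in the figure, the transmit value becomes 0xC900 and the received word is masked down to 0x00C9.

```c
#include <stdint.h>

/* Left-justify a character of char_len bits (1..16) in a 16-bit word,
 * as required before writing it for transmission. */
uint16_t spi_left_justify(uint16_t value, unsigned char_len)
{
    if (char_len < 1u || char_len > 16u) {
        return 0u; /* invalid length */
    }
    return (uint16_t)((uint32_t)value << (16u - char_len));
}

/* Mask off the unused upper bits of a right-justified received word. */
uint16_t spi_mask_received(uint16_t rx_word, unsigned char_len)
{
    uint16_t mask;
    if (char_len < 1u || char_len > 16u) {
        return 0u; /* invalid length */
    }
    mask = (char_len == 16u) ? 0xFFFFu
                             : (uint16_t)((1u << char_len) - 1u);
    return (uint16_t)(rx_word & mask);
}
```

For a 9-bit character the mask is 0x01FF, which clears the 7 MSBs.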
SPI Registers
Baud Rate Determination: The Master specifies the communication baud rate using its baud rate register (SPIBRR.6-0):
For SPIBRR = 3 to 127: SPICLK signal = LSPCLK / (SPIBRR + 1)
For SPIBRR = 0, 1, or 2: SPICLK signal = LSPCLK / 4
From the above equations, one can compute: Maximum data rate = 25 Mbps @ 100 MHz
Character Length Determination: The Master and Slave must be configured for the same transmission character length. This is done with bits 0, 1, 2 and 3 of the configuration control register (SPICCR.3-0). These four bits produce a binary number, from which the character length is computed as binary + 1 (e.g. SPICCR.3-0 = 0010 gives a character length of 3).
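The two SPI relationships (baud rate from SPIBRR, character length from SPICCR.3-0) can be written as small C functions. This is a host-side sketch with hypothetical function names; it only performs the arithmetic and does not program the peripheral. With LSPCLK = 100 MHz, SPIBRR values 0 through 3 all give 25 Mbps and SPIBRR = 127 gives 781,250 bps.

```c
#include <stdint.h>

/* SPICLK rate for a given LSPCLK and SPIBRR (7-bit field). */
uint32_t spi_baud_hz(uint32_t lspclk_hz, uint8_t spibrr)
{
    uint32_t brr = (uint32_t)(spibrr & 0x7Fu);
    if (brr < 3u) {
        return lspclk_hz / 4u;      /* SPIBRR = 0, 1, or 2 */
    }
    return lspclk_hz / (brr + 1u);  /* SPIBRR = 3 to 127 */
}

/* Character length encoded in SPICCR.3-0 (binary value + 1). */
unsigned spi_char_length(uint8_t spiccr)
{
    return (unsigned)(spiccr & 0x0Fu) + 1u;
}
```

Because SPIBRR = 0, 1, 2 and 3 all yield LSPCLK / 4, the 128 register values produce 125 distinct rates.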
Status: SpixRegs.SPISTS
RX Overrun Flag, Interrupt Flag, TX Buffer Full Flag
SPI Summary
Synchronous serial communications
Two wire transmit or receive (half duplex)
Three wire transmit and receive (full duplex)
Data length programmable from 1-16 bits
125 different programmable baud rates
[Figure: SCI Device #1 connected to SCI Device #2. Recovered labels on each device: SCIRXD, RX FIFO_15.]
0 = Disabled 1 = Enabled
0 = Disabled 1 = Enabled
The basic unit of data is called a character and is 1 to 8 bits in length. Each character of data is formatted with a start bit, 1 or 2 stop bits, an optional parity bit, and an optional address/data bit. A character of data along with its formatting bits is called a frame. Frames are organized into groups called blocks. If more than two serial ports exist on the SCI bus, a block of data will usually begin with an address frame which specifies the destination port of the data as determined by the user's protocol. The start bit is a low bit at the beginning of each frame which marks the beginning of a frame. The SCI uses an NRZ (Non-Return-to-Zero) format, which means that in an inactive state the SCIRX and SCITX lines will be held high. Peripherals are expected to pull the SCIRX and SCITX lines to a high level when they are not receiving or transmitting on their respective lines. When configuring the SCICCR, the SCI port should first be held in an inactive state. This is done using the SW RESET bit of the SCI Control Register 1 (SCICTL1.5). Writing a 0 to this bit initializes and holds the SCI state machines and operating flags at their reset condition. The SCICCR can then be configured. Afterwards, re-enable the SCI port by writing a 1 to the SW RESET bit. At system reset, the SW RESET bit equals 0.
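The frame format described above determines how many bit times each character occupies on the line. A minimal host-side sketch of that count follows; the function name `sci_frame_bits` is hypothetical. A frame is one start bit, the data bits, the optional address/data bit, the optional parity bit, and one or two stop bits.

```c
#include <stdbool.h>

/* Number of bit times in one SCI frame, or 0 for an invalid format.
 * data_bits: 1..8, stop_bits: 1 or 2. */
unsigned sci_frame_bits(unsigned data_bits, bool parity, bool addr_bit,
                        unsigned stop_bits)
{
    if (data_bits < 1u || data_bits > 8u) {
        return 0u;
    }
    if (stop_bits < 1u || stop_bits > 2u) {
        return 0u;
    }
    return 1u                      /* start bit */
         + data_bits               /* character */
         + (addr_bit ? 1u : 0u)    /* optional address/data bit */
         + (parity ? 1u : 0u)      /* optional parity bit */
         + stop_bits;              /* 1 or 2 stop bits */
}
```

An 8-bit character with no parity, no address bit and one stop bit therefore takes 10 bit times per frame.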
Start Bit
Falling Edge Detected
LSB of Data
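[Figure: Idle-line multiprocessor wake-up mode. Each block of frames (Addr, Data, ..., Last Data; ST = start bit, SP = stop bit) is separated by an idle period of 10 bits or greater. The address frame follows an idle of 10 bits or greater.]
[Figure: Address-bit multiprocessor wake-up mode on SCIRXD/SCITXD. Each frame carries an ADDR/DATA bit: 1 in the Addr frame, 0 in the Data and Last Data frames. The first frame within a block is the Address frame, with the ADDR/DATA bit set to 1.]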
The SCI interrupt logic generates interrupt flags when it receives or transmits a complete character as determined by the SCI character length. This provides a convenient and efficient way of timing and controlling the operation of the SCI transmitter and receiver. The interrupt flag for the transmitter is TXRDY (SCICTL2.7), and for the receiver RXRDY (SCIRXST.6). TXRDY is set when a character is transferred to TXSHF and SCITXBUF is ready to receive the next character. In addition, when both the SCITXBUF and TXSHF registers are empty, the TX EMPTY flag (SCICTL2.6) is set. When a new character has been received and shifted into SCIRXBUF, the RXRDY flag is set. In addition, the BRKDT flag is set if a break condition occurs. A break condition is where the SCIRXD line remains continuously low for at least ten bits, beginning after a missing stop bit. Each of the above flags can be polled by the CPU to control SCI operations, or interrupts associated with the flags can be enabled by setting the RX/BK INT ENA (SCICTL2.1) and/or the TX INT ENA (SCICTL2.0) bits active high. Additional flag and interrupt capability exists for other receiver errors. The RX ERROR flag is the logical OR of the break detect (BRKDT), framing error (FE), receiver overrun (OE), and parity error (PE) bits. RX ERROR high indicates that at least one of these four errors has occurred during transmission. This will also send an interrupt request to the CPU if the RX ERR INT ENA (SCICTL1.6) bit is set.
SCI Registers
Baud Rate Determination: The values in the baud-select registers (SCIHBAUD and SCILBAUD) concatenate to form a 16 bit number that specifies the baud rate for the SCI.
For BRR = 1 to 65535: SCI Baud Rate = LSPCLK / ((BRR + 1) x 8) bits/sec
For BRR = 0: SCI Baud Rate = LSPCLK / 16 bits/sec
Max data rate = 6.25 Mbps @ 100 MHz
Note that the CLKOUT for the SCI module is one-half the CPU clock rate.
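The SCI baud relationship can be sketched in C in both directions: from a BRR value to the resulting baud rate, and from a target baud rate to the nearest BRR at or below the exact quotient. This is host-side arithmetic only; the function names are hypothetical, and the 37.5 MHz LSPCLK used in the usage note is an assumption for illustration, not a value taken from this module.

```c
#include <stdint.h>

/* Baud rate produced by a given LSPCLK and 16-bit BRR value. */
uint32_t sci_baud_hz(uint32_t lspclk_hz, uint16_t brr)
{
    if (brr == 0u) {
        return lspclk_hz / 16u;
    }
    return lspclk_hz / (((uint32_t)brr + 1u) * 8u);
}

/* BRR for a target baud rate: BRR = LSPCLK / (baud * 8) - 1,
 * truncated and clamped to the 16-bit register range. */
uint16_t sci_brr_for(uint32_t lspclk_hz, uint32_t baud)
{
    uint64_t divisor;
    uint64_t quotient;
    if (baud == 0u) {
        return 0u;
    }
    divisor = (uint64_t)baud * 8u;
    quotient = (uint64_t)lspclk_hz / divisor;
    if (quotient <= 1u) {
        return 0u;
    }
    if (quotient - 1u > 65535u) {
        return 65535u;
    }
    return (uint16_t)(quotient - 1u);
}
```

Assuming LSPCLK = 37.5 MHz, a 9600 baud target gives BRR = 487, and that BRR produces an actual rate of about 9605 bits/sec because of integer truncation.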
Control 2: ScixRegs.SCICTL2
TX Buffer Full / Empty Flag, TX Ready Interrupt Enable, RX Break Interrupt Enable
SCI Summary
Asynchronous communications format
65,000+ different programmable baud rates
Two wake-up multiprocessor modes: Idle-line wake-up and Address-bit wake-up
Transmit FIFO and receive FIFO
Individual interrupts for transmit and receive
[Figure: McBSP block diagram. Recovered labels: transmit pins MFSXx, MCLKXx, MDXx; receive pins MDRx, MCLKRx, MFSRx; CPU.]
Bit: one data bit per serial clock period
Word or channel: contains a number of bits (8, 12, 16, 20, 24, 32)
Frame: contains one or multiple words
Number of words per frame: 1-128
[Figure: frame sync (FS) and data (D) signals showing a frame made up of words w0 to w7.]
Multi-Channel Selection
[Figure: multi-channel selection. A TDM bit stream between a CODEC and the McBSP carries channels Ch0 to Ch31 in each frame; in multi-channel mode only the selected channels are kept (Ch0-0, Ch0-1, Ch5-0, Ch5-1, Ch27-0, Ch27-1).]
Allows multiple channels (words) to be independently selected for transmit and receive (e.g. only enable Ch0, 5, 27 for receive, then process via CPU).
The McBSP keeps time sync with all channels, but only "listens" or "talks" if the specific channel is enabled (reduces processing/bus overhead).
Multi-channel mode is controlled primarily via two registers:
Register         Name                           Function
MCR              Multi-channel Control Reg      enables Mc-mode
R/XCER (A-H)     Rec/Xmt Channel Enable Regs    enable/disable channels
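Channel selection can be pictured as one enable bit per channel, spread over the eight channel enable registers (A-H). The sketch below is a host-side model only: the array `cer`, the function names, and the mapping of channel n to register n / 16, bit n % 16 are assumptions made for illustration and should be checked against the McBSP register descriptions before being relied on.

```c
#include <stdbool.h>
#include <stdint.h>

#define CER_REGS     8u   /* channel enable registers A..H */
#define CER_BITS     16u  /* enable bits per register */
#define MAX_CHANNELS (CER_REGS * CER_BITS) /* 128 channels */

/* Set the enable bit for one channel in the modeled register array. */
void mcbsp_enable_channel(uint16_t cer[CER_REGS], unsigned channel)
{
    if (channel < MAX_CHANNELS) {
        cer[channel / CER_BITS] |= (uint16_t)(1u << (channel % CER_BITS));
    }
}

/* Report whether a channel is enabled in the modeled register array. */
bool mcbsp_channel_enabled(const uint16_t cer[CER_REGS], unsigned channel)
{
    if (channel >= MAX_CHANNELS) {
        return false;
    }
    return ((cer[channel / CER_BITS] >> (channel % CER_BITS)) & 1u) != 0u;
}
```

Enabling channels 0, 5 and 27, as in the slide example, sets bits 0 and 5 of the first modeled register and bit 11 of the second.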
McBSP Summary
Independent clocking and framing for transmit and receive
Internal or external clock and frame sync
Data size of 8, 12, 16, 20, 24, or 32 bits
TDM mode - up to 128 channels (used for T1/E1 interfacing)
µ-law and A-law companding
SPI mode
Direct Interface to many codecs
Can be serviced by the DMA
[Figure: I2C bus with an I2C EPROM, 28xx I2C devices and an I2C Controller attached. I2C module block diagram labels: I2CXSR, I2CDXR, TX FIFO, SCL, Clock Circuits.]
[Figure: I2C data formats. Recovered fields: Slave Address, R/W, ACK, Data, ACK, Data, ACK, P; a format whose first byte is 11110AA; and a format consisting of Data and ACK fields ending in P.]
R/W = 0: master writes data to addressed slave
R/W = 1: master reads data from the slave
n = 1 to 8 bits
S = Start (high-to-low transition on SDA while SCL is high)
P = Stop (low-to-high transition on SDA while SCL is high)
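The address fields in the formats above can be built with simple shifts. The following host-side sketch uses hypothetical function names and follows the general I2C convention that the R/W bit is the least significant bit of the address byte, and that a 10-bit address is sent as a first byte 11110AA plus R/W (AA being the two most significant address bits) followed by a byte with the remaining eight bits.

```c
#include <stdint.h>

/* First byte for 7-bit addressing: address in bits 7..1, R/W in bit 0. */
uint8_t i2c_addr7_byte(uint8_t addr7, unsigned read)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (read ? 1u : 0u));
}

/* First byte for 10-bit addressing: 11110 A9 A8 R/W. */
uint8_t i2c_addr10_first_byte(uint16_t addr10, unsigned read)
{
    unsigned high_bits = (addr10 >> 8) & 0x3u;
    return (uint8_t)(0xF0u | (high_bits << 1) | (read ? 1u : 0u));
}

/* Second byte for 10-bit addressing: A7..A0. */
uint8_t i2c_addr10_second_byte(uint16_t addr10)
{
    return (uint8_t)(addr10 & 0xFFu);
}
```

For a 7-bit slave address of 0x50, a write uses the byte 0xA0 and a read uses 0xA1.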
I2C Arbitration
Arbitration procedure is invoked if two or more master-transmitters simultaneously start transmission.
Procedure uses data presented on the serial data bus (SDA) by competing transmitters.
First master-transmitter which drives SDA high is overruled by another master-transmitter that drives SDA low.
Procedure gives priority to the data stream with the lowest binary value.
[Figure: Device #1 loses arbitration and switches to slave-receiver mode; Device #2 drives SDA.]
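The arbitration rule can be simulated bit by bit. In the sketch below (host-side, with the hypothetical function name `i2c_arbitrate` and an arbitrary limit of eight masters), the bus level for each bit is the AND of what the still-competing masters drive, and any master that drives a 1 while the bus reads 0 drops out. Only one byte per master is compared, MSB first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MASTERS 8 /* hypothetical limit for this sketch */

/* Returns the index of the master that wins arbitration over one byte,
 * or -1 if the number of masters is out of range. If several masters
 * send identical data, the lowest index among them is returned. */
int i2c_arbitrate(const uint8_t *bytes, size_t n)
{
    bool active[MAX_MASTERS];
    if (n == 0 || n > MAX_MASTERS) {
        return -1;
    }
    for (size_t m = 0; m < n; m++) {
        active[m] = true;
    }
    for (int bit = 7; bit >= 0; bit--) {
        unsigned sda = 1u; /* bus is high unless someone drives it low */
        for (size_t m = 0; m < n; m++) {
            if (active[m]) {
                sda &= ((unsigned)bytes[m] >> bit) & 1u;
            }
        }
        for (size_t m = 0; m < n; m++) {
            /* A master driving high while SDA is low has lost. */
            if (active[m] && ((((unsigned)bytes[m] >> bit) & 1u) != sda)) {
                active[m] = false;
            }
        }
    }
    for (size_t m = 0; m < n; m++) {
        if (active[m]) {
            return (int)m;
        }
    }
    return -1;
}
```

With the bytes 0xB4, 0xB2 and 0xB6, the second master wins because 0xB2 is the lowest binary value.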
I2C Summary
Compliance with Philips I2C-bus specification (version 2.1)
7-bit and 10-bit addressing modes
Configurable 1 to 8 bit data words
Data transfer rate from 10 kbps up to 400 kbps
Transmit FIFO and receive FIFO
CAN does not use physical addresses to address stations. Each message is sent with an identifier that is recognized by the different nodes. The identifier has two functions: it is used for message filtering and for message priority. The identifier determines if a transmitted message will be received by CAN modules and determines the priority of the message when two or more nodes want to transmit at the same time.
CAN Bus
Two wire differential bus (usually twisted pair)
Max. bus length depends on transmission rate: 40 meters @ 1 Mbps
[Figure: CAN NODE A, CAN NODE B and CAN NODE C attached to the CAN_H and CAN_L lines, with 120 Ω terminations.]
The DSP communicates with the CAN bus using a transceiver. The CAN bus is a twisted pair wire, and the maximum transmission rate depends on the bus length. If the bus is less than 40 meters long, transmission rates of up to 1 Mbit/second are possible.
CAN Node
[Figure: Wired-AND bus connection. Recovered labels: CAN_H, CAN_L, 120 Ω terminations, TX.]
Principles of Operation
Data messages transmitted are identifier based, not address based.
Content of message is labeled by an identifier that is unique throughout the network (e.g. rpm, temperature, position, pressure, etc.).
All nodes on the network receive the message and each performs an acceptance test on the identifier.
If the message is relevant, it is processed (received); otherwise it is ignored.
The unique identifier also determines the priority of the message (the lower the numerical value of the identifier, the higher the priority).
When two or more nodes attempt to transmit at the same time, a non-destructive arbitration technique guarantees messages are sent in order of priority and no messages are lost.
[Figure: arbitration example. After the Start Bit, Node A, Node B and Node C transmit their identifiers onto the CAN Bus; Node B loses arbitration, then Node C loses arbitration.]
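The two roles of the identifier, priority and filtering, can be sketched in C. This is a host-side illustration with hypothetical function names. The priority rule (lowest identifier wins) comes from the text above; the acceptance test uses a simple "mask bit set means this bit must match" convention that is an assumption of this sketch and may not match the polarity of the eCAN acceptance mask registers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the node whose identifier wins arbitration (lowest value),
 * or -1 if no node is transmitting. */
int can_arbitration_winner(const uint32_t *ids, size_t n)
{
    size_t best = 0;
    if (n == 0) {
        return -1;
    }
    for (size_t i = 1; i < n; i++) {
        if (ids[i] < ids[best]) {
            best = i;
        }
    }
    return (int)best;
}

/* Hypothetical acceptance test: a message is accepted when every
 * identifier bit selected by must_match_mask equals the filter bit. */
bool can_accept(uint32_t id, uint32_t filter, uint32_t must_match_mask)
{
    return ((id ^ filter) & must_match_mask) == 0u;
}
```

With identifiers 0x65A, 0x123 and 0x7FF pending at the same time, the node sending 0x123 transmits first and the others retry afterwards.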
[Figure: CAN message format, shown for two frame types. Recovered field labels: 0 to 8 Bytes Data, CRC, ACK.]
The DSP CAN module is a full CAN Controller. It contains a message handler for transmission and reception management, and frame storage. The specification is CAN 2.0B Active; that is, the module can send and accept standard (11-bit identifier) and extended (29-bit identifier) frames.
[Figure: eCAN block diagram. Recovered labels: Address, Data, 32-bit buses, Receive Buffer, Transmit Buffer, Control Buffer, Status Buffer, CAN Bus.]
The CAN controller module contains 32 mailboxes for objects of 0 to 8-byte data lengths:
configurable transmit/receive mailboxes
configurable with standard or extended identifier

The CAN module mailboxes are divided into several parts:
MID: contains the identifier of the mailbox
MCF (Message Control Field): contains the length of the message (to transmit or receive) and the RTR bit (Remote Transmission Request, used to send remote frames)
MDL and MDH: contain the data

The CAN module contains registers which are divided into five groups. These registers are located in data memory from 0x006000 to 0x0061FF. The five register groups are:
Control & Status Registers
Local Acceptance Masks
Message Object Time Stamps
Message Object Timeout
Mailboxes
eCAN Summary
Fully compliant with CAN standard v2.0B
Supports data rates up to 1 Mbps
Thirty-two mailboxes:
    Configurable as receive or transmit
    Configurable with standard or extended identifier
    Programmable receive mask
    Uses 32-bit time stamp on messages
    Programmable interrupt scheme (two levels)
    Programmable alarm time-out
DSP/BIOS
Introduction
This module discusses the basic features of using DSP/BIOS in a system. Scheduling threads, periodic functions, and the use of real-time analysis tools will be demonstrated, in addition to programming the flash with DSP/BIOS.
Learning Objectives
Introduction to DSP/BIOS
DSP/BIOS Configuration Tool
Scheduling DSP/BIOS threads
Periodic Functions
Real-time Analysis Tools
Flash Programming with DSP/BIOS
Module Topics
DSP/BIOS
    Module Topics
    Introduction to DSP/BIOS
    DSP/BIOS Configuration Tool
    Lab 12a: DSP/BIOS Configuration Tool
    Scheduling DSP/BIOS Threads
    Periodic Functions
    Real-time Analysis Tools
    Lab 12b: DSP/BIOS
    DSP/BIOS and Programming the Flash
    Lab 12c: Flash Programming with DSP/BIOS
Introduction to DSP/BIOS
What is DSP/BIOS?
A full-featured, scalable real-time kernel:
System configuration tools
Preemptive multi-threading scheduler
Real-time analysis tools
Real-Time Scheduler
Preemptive thread manager kernel configures DSP/BIOS scheduling
Real-Time I/O
Allows two way communication between threads or between target and PC host
The GUI (graphical user interface) simplifies system design by:
Automatically including the appropriate runtime support libraries
Automatically handling interrupt vectors and system reset
Handling system memory configuration (builds .cmd file)
When a .tcf file is saved, the Config Tool generates 5 additional files:
Filename.tcf       Text Configuration File
Filenamecfg_c.c    C code created by Config Tool
Filenamecfg.s28    ASM code created by Config Tool
Filenamecfg.cmd    Linker command file
Filenamecfg.h      header file for *cfg_c.c
Filenamecfg.h28    header file for *cfg.s28
When you add a .tcf file to your project, CCS automatically adds the C and assembly (.s28) files and the linker command file (.cmd) to the project under the Generated Files folder.
This file contains two main parts, MEMORY and SECTIONS. (Though, if you open and examine it, it's not quite as nicely laid out as shown above.)
Running the Linker
The linker's main purpose is to link together various object files. It combines like-named input sections from the various object files and places each new output section at specific locations in memory. In the process, it resolves (provides actual addresses for) all of the symbols described in your code. The linker can create two outputs: the executable (.out) file and a report which describes the results of linking (.map).
Note: The linker gets run automatically when you BUILD or REBUILD your project.
F28335
System Description:
TMS320F28335 All internal memory blocks allocated
Procedure
Project File
1. A project named Lab12a.pjt has been created for this lab. Open the project by clicking on Project → Open… and look in C:\C28x\Labs\Lab12a. The lab files from module 9 will be used as a starting point for the lab exercises in this DSP/BIOS module. All Build Options have been configured the same as the previous lab. The files used in this lab are:
A dialog box appears. The TCF files shown in the aforementioned dialog box are called seed TCF files. TCF files are used to configure many objects specific to the processor.
On the 2xxx tab select the ti.platforms.ezdsp28335 template and click OK. A configuration window will open.
8. Save the configuration file by selecting: File → Save As… and name it Lab.tcf in C:\C28x\Labs\Lab12a then click Save. Close the configuration window and select YES to save changes to Lab.tcf.
9. Add the configuration file to the project. Click: Project → Add Files to Project… Make sure you're looking in C:\C28x\Labs\Lab12a. Change the files of type to view All Files (*.*) and select Lab.tcf. Click OPEN to add the file to the project.
10. In the project window left click the plus sign (+) to the left of DSP/BIOS Config. Notice that the Lab.tcf file is listed.
11. Next, add the generated linker command file Labcfg.cmd to the project. After the file has been added you will notice that it is listed under the source files.
Memory (names not recoverable)    Base         Length    Space
                                               0x0002    code
                                  0x3F EBDC    0x06A0    code
                                  0x3F E000    0x0B50    code
                                  0x3F EB50    0x008C    code
16. Modify the base addresses, length, and space of each of the memory sections to avoid memory conflicts with the newly added memory sections as shown in the table below.
Memory     Base         Length    Space
BOOTROM    0x3F F37C    0x0D44    code
MSARAM     0x00 0002    0x07FE    data
17. Next, modify the space setting for L03SARAM to be code and the space setting for L47SARAM to be data.
18. Right click on MEM Memory Section Manager and select Properties. Select the Compiler Sections tab and notice that defined sections have been linked into the appropriate memories via the pull-down boxes. The .stack section has been linked into memory using the BIOS Data tab. The default settings are sufficient for getting started. Click OK to close the Properties window.
23. Modify the configuration file Lab.tcf to set up the PIE vector for the watchdog interrupt. Click on the plus sign (+) to the left of Scheduling and again on the plus sign (+) to the left of HWI Hardware Interrupt Service Routine Manager. Click the plus sign (+) to the left of PIE INTERRUPTS. Locate the interrupt location for the watchdog at PIE_INT1_8. Right click, select Properties, and type _WAKEINT_ISR (with a leading underscore) in the function field. Click OK to save.
24. Set up the PIE vector for the ADC interrupt. Locate the interrupt location for the ADC at PIE_INT1_6. Right click, select Properties, and type _ADCINT_ISR (with a leading underscore) in the function field. Click OK to save.
25. Set up the PIE vector for the ECAP1 interrupt. Locate the interrupt location for the ECAP1 at PIE_INT4_1. Right click, select Properties, and type _ECAP1_INT_ISR (with a leading underscore) in the function field. Click OK to save.
26. Set up the PIE vector for the DMA channel 1 interrupt. Locate the interrupt location for the DMA channel 1 at PIE_INT7_1. Right click, select Properties, and type _DINTCH1_ISR (with a leading underscore) in the function field. Click OK to save. Close the configuration window and select YES to save changes to Lab.tcf.
30. Open and setup a dual time graph to plot a 48-point window of the filtered and unfiltered ADC results buffer. Click: View Graph Time/Frequency and set the following values:
12 - 13
Display Type
Dual Time
Start Address upper display AdcBufFiltered Start Address lower display AdcBuf Acquisition Buffer Size Display Data Size DSP Data Type Sampling Rate (Hz) Time Display Unit Select OK to save the graph options. (Note: the math type in IQmathlib.h has been defined as floating-point in the previous lab exercise). 31. Run the code in real-time mode using the GEL function: GEL Realtime Emulation Control Run_Realtime_with_Reset, and watch the graphical display update. 32. The graphical display should show the generated FIR filtered 2 kHz, 25% duty cycle symmetric PWM waveform in the upper display and the unfiltered waveform in the lower display. Confirm that the results match the Lab 9 exercise. 33. Fully halt the DSP (real-time mode) by using the GEL function: GEL Emulation Control Full_Halt. End of Exercise Realtime 48 48 32-bit floating-point 48000 s
[Slide: DSP/BIOS thread types in priority order: SWI (Software Interrupts), TSK (Tasks), IDL (Background)]
- BIOS will enable global interrupts for you
- Must delete the endless loop at the end of main()
- main() returns to BIOS and goes to the IDLE thread, allowing BIOS to schedule events, transfer data to the host, etc.
- An endless loop in main() will keep BIOS from running
SWI Properties
[Figure: scheduling example. Threads shown from highest to lowest priority: HWI 1, SWI 3, SWI 2, SWI 1, MAIN, IDLE (lowest). Events shown: int1, int2, post1, rtn.]
SWI (starts on SWI_post, must run to completion)
- Similar to a hardware interrupt, but triggered by SWI_post()
- SWIs must run to completion
- All SWIs use the system stack
  - faster context switching
  - smaller code size

TSK (can wait on SEM_pend between start and end; released by SEM_post)
- SEM_post() readies the TSK which pends on an event
- TSKs can be terminated by software
- Each TSK has its own stack
  - slower context switching
  - larger code size
Periodic Functions
- Periodic functions are a special type of SWI that are triggered by DSP/BIOS
- Periodic functions run at a user specified rate:
  - e.g. LED blink requires 0.5 Hz
- Use the CLK Manager to specify the DSP/BIOS CLK rate in microseconds per tick
- Use the PRD Manager to specify the period (for the function) in ticks
- Allows multiple periodic functions with different rates
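The relationship between the CLK rate and the PRD period can be sketched as simple arithmetic. The helper below is only an illustration: the name prd_period_ticks is hypothetical and is not a DSP/BIOS call, and it assumes the period is an exact multiple of the tick.

```c
#include <stdint.h>

/* Hypothetical helper (not part of DSP/BIOS): convert a desired period in
 * microseconds into the tick count entered in the PRD "period (ticks)" field.
 * us_per_tick is the CLK Manager setting (microseconds per tick).
 * Returns 0 if us_per_tick is 0 or the period is shorter than one tick. */
uint32_t prd_period_ticks(uint32_t period_us, uint32_t us_per_tick)
{
    if (us_per_tick == 0u) {
        return 0u;
    }
    return period_us / us_per_tick;   /* integer division truncates */
}
```

For example, with a 1000 microsecond tick, a function that should run every 500 milliseconds needs prd_period_ticks(500000, 1000), which is 500 ticks.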
Execution Graph
- Software logic analyzer
- Debug event timing and priority

Message LOG
- Send debug messages to host
- Doesn't halt the DSP
- Deterministic, low DSP cycle count
- More efficient than traditional printf()
Objective:
- Change DMA DINTCH1_ISR HWI to SWI
- Replace LED blink routine with a Periodic Function

[Lab diagram. Labels: ADC, RESULT0, DMA, connector wire, FIR Filter, pointer rewind. ePWM2 triggering ADC on period match using SOC A, with a trigger every 20.833 µs (48 kHz).]
It will be interesting to investigate the DSP computational burden of the various parts of our application, as well as the different pieces of DSP/BIOS that we will be using in this lab. The CPU Load Graph feature of DSP/BIOS will provide a quick and easy method for doing this. We will be tabulating these results in the table that follows at various steps throughout the remainder of this lab.
[Table 12-1: CPU load results, recorded for each case during this lab]
Procedure
Project File
1. A project named Lab12b.pjt has been created for this lab. Open the project by clicking on Project → Open and look in C:\C28x\Labs\Lab12b. All Build Options have been configured the same as the previous lab. The files used in this lab are:

Adc_9_10_12.c                    Filter.c
CodeStartBranch.asm              Gpio.c
DefaultIsr_12b.c                 Lab.tcf
DelayUs.asm                      Lab_12a_12b.cmd
Dma.c                            Labcfg.cmd
DSP2833x_GlobalVariableDefs.c    Main_12b.c
DSP2833x_Headers_BIOS.cmd        PieCtrl_12.c
ECap_7_8_9_10_12.c               SysCtrl.c
EPwm_7_8_9_10_12.c               Watchdog.c
6. We will be running our code in real-time mode, and will have our window continuously refresh. Run in real-time mode using the GEL function: GEL → Realtime Emulation Control → Run_Realtime_with_Reset.
Note: For the next step, check to be sure that the jumper wire connecting PWM1A (pin # P8-9) to ADCINA0 (pin # P9-2) is still in place on the eZdsp.
7. Open and set up a dual time graph to plot a 48-point window of the filtered and unfiltered ADC results buffer. Click: View → Graph → Time/Frequency and set the following values:

Display Type                     Dual Time
Start Address - upper display    AdcBufFiltered
Start Address - lower display    AdcBuf
Acquisition Buffer Size          48
Display Data Size                48
DSP Data Type                    32-bit floating-point
Sampling Rate (Hz)               48000
Time Display Unit                µs

Select OK to save the graph options.
8. The graphical display should show the generated FIR filtered 2 kHz, 25% duty cycle symmetric PWM waveform in the upper display and the unfiltered waveform in the lower display. The results should be the same as the previous lab. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.
9. Open the RTA Control Panel by clicking DSP/BIOS → RTA Control Panel. Uncheck ALL of the boxes. This disables most of the real-time analysis tools. We will selectively enable them later in the lab.
10. Open the CPU Load Graph by clicking DSP/BIOS → CPU Load Graph. The CPU load graph displays the percentage of available CPU computing horsepower that the application is consuming. The CPU may be running ISRs, software interrupts, periodic functions, performing I/O with the host, or running any user routine. When the CPU is not executing user code, it will be idle (in the DSP/BIOS idle thread). Run the code (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Run_Realtime_with_Reset.
This graph should start updating, showing the percentage load on the DSP CPU. Keep the DSP running to complete steps 11 through 15.
11. Open and inspect Main_12b.c. Notice that the global variable DEBUG_FILTER is used to control the FIR filter in DINTCH1_ISR(). If DEBUG_FILTER = 1, the FIR filter is called and the AdcBufFiltered array is filled with the filtered data. On the other hand, if DEBUG_FILTER = 0, the filter is not called and the AdcBufFiltered array is filled with the unfiltered data.
12. Open the watch window and add the variable DEBUG_FILTER to it. Change its value to 0 to turn off the FIR filtering. Notice the decrease in the CPU Load Graph.
13. Record the value shown in the CPU Load Graph under Case #1 in Table 12-1.
14. Change the value of DEBUG_FILTER back to 1 in the watch window in order to bring the FIR filter back online. Notice the jump in the CPU Load Graph.
15. Record the value shown in the CPU Load Graph under Case #2 in Table 12-1.
16. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.
Create a SWI
17. Open Main_12b.c and notice that space has been added at the end of main() for two new functions which will be used in this lab: Dma1Swi() and LedBlink(). (Space has also been provided for AdcSwi() for the optional exercise.) In the next few steps, we will move part of the DINTCH1_ISR() routine from DefaultIsr_12b.c to this space in Main_12b.c.
18. Open DefaultIsr_12b.c and locate the DINTCH1_ISR() routine. Move the entire contents of the DINTCH1_ISR() routine to the Dma1Swi() function in Main_12b.c with the following exceptions:
DO NOT MOVE:
- The instruction used to acknowledge the PIE group interrupt
- The static local variable declaration of GPIO32_count
- The GPIO pin toggle code / LED toggle code
Be sure to move all of the other static local variable declarations at the top of DINTCH1_ISR() that are used to index into the ADC buffers. (Do not move the static local variable declaration of GPIO32_count.)
Comment: In almost all applications, the PIE group acknowledge code is left in the HWI (rather than moved to a SWI). This allows other interrupts to occur on that PIE group even if the SWI has not yet executed. On the other hand, we are leaving the GPIO and LED toggle code in the HWI just as an example. It illustrates that you can post a SWI and also do additional operations in the HWI. DSP/BIOS is extremely flexible!
19. Delete the interrupt keyword from DINTCH1_ISR(). The interrupt keyword is not used when a HWI is under DSP/BIOS control. A HWI is under DSP/BIOS control when it uses any DSP/BIOS functionality, such as posting a SWI, or calling any DSP/BIOS function or macro.
Post a SWI
20. In DefaultIsr_12b.c add the following SWI_post to the DINTCH1_ISR(), just after the structure used to acknowledge the PIE group:
SWI_post(&DMA1_swi); // post a SWI
This posts the DMA1_swi SWI, which will execute the Dma1Swi() code you populated a few steps back in the lab. In other words, the DMA1 interrupt still executes the same code as before. However, most of that code is now in a posted SWI that DSP/BIOS will execute according to the specified scheduling priorities. Save and close the modified files.
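The split between the HWI and the posted SWI can be pictured with a small host-side model. Everything in this sketch is a hypothetical stand-in written for illustration (the Model* type and model_* functions are not the DSP/BIOS API, and the real scheduler also handles priorities); it only shows that the interrupt routine marks work as ready and returns, and that the deferred function runs afterwards.

```c
#include <stdbool.h>

/* Host-side model only: hypothetical stand-ins, NOT the DSP/BIOS SWI API. */
typedef struct {
    void (*fxn)(void);   /* function run when the SWI is serviced */
    bool posted;         /* set by the "HWI", cleared by the scheduler */
} ModelSwi;

static int buffers_processed = 0;

/* Deferred work: stands in for the code moved into Dma1Swi(). */
static void model_dma1_swi_fxn(void)
{
    buffers_processed++;
}

static ModelSwi model_dma1_swi = { model_dma1_swi_fxn, false };

/* Stand-in for SWI_post(): only marks the SWI as ready to run. */
void model_swi_post(ModelSwi *swi)
{
    swi->posted = true;
}

/* Stand-in for the HWI: acknowledge the PIE group (omitted), post, return. */
void model_dintch1_isr(void)
{
    model_swi_post(&model_dma1_swi);
}

/* Stand-in for the scheduler: services the SWI after the HWI has returned.
 * Returns the number of times the deferred function has run so far. */
int model_run_pending(void)
{
    if (model_dma1_swi.posted) {
        model_dma1_swi.posted = false;
        model_dma1_swi.fxn();
    }
    return buffers_processed;
}
```

In this model, calling model_dintch1_isr() does no processing itself; the counter only advances on the next call to model_run_pending().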
Now delete from the DINTCH1_ISR() the code used to implement the interval counter for the LED toggle (i.e., the GPIO32_count++ loop), and also delete the declaration of GPIO32_count itself from the beginning of DINTCH1_ISR(). These are no longer needed, as DSP/BIOS will implement the interval counter for us in the periodic function configuration (next step in the lab). Save and close the modified files.
31. In the configuration file Lab.tcf we need to add and set up the LedBlink_PRD. Open Lab.tcf and click on the plus sign (+) to the left of Scheduling. Right click on PRD - Periodic Function Manager and select Insert PRD. Rename PRD0 to LedBlink_PRD and click OK. Select the Properties for LedBlink_PRD and type _LedBlink (with a leading underscore) in the function field. This tells DSP/BIOS to run the LedBlink() function when it executes the LedBlink_PRD periodic function object. Next, in the period (ticks) field type 500. The default DSP/BIOS system timer increments every 1 millisecond, so this tells the DSP/BIOS scheduler to run the LedBlink() function every 500 milliseconds. A PRD object is just a special type of SWI which gets scheduled periodically and runs in the context of the SWI level at a specified SWI priority. Click OK. Close the configuration file and click YES to save changes.
/*** Using LOG_printf() to write to a log buffer ***/
LOG_printf(&trace, "LedSwiCount = %u", LedSwiCount++);
37. In the configuration file Lab.tcf we need to add and set up the trace buffer. Open Lab.tcf and click on the plus sign (+) to the left of Instrumentation and again on the plus sign (+) to the left of LOG - Event Log Manager.
38. Right click on LOG - Event Log Manager and select Insert LOG. Rename LOG0 to trace and click OK.
39. Select the Properties for trace and confirm that the logtype is set to circular and the datatype is set to printf. Click OK. Close the configuration file and click YES to save changes.
40. Since the configuration file was modified, we need to rebuild the project. Click the Build button.
The message log dialog box is displaying the commanded LOG_printf() output, i.e. the number of times (count value) that LedBlink() has executed.
43. Verify that all the check boxes in the RTA Control Panel window are still unchecked (from step 9). Then, check the box marked Global Host Enable. This is the main control switch for most of the RTA tools. We will be selectively enabling the rest of the check boxes in this portion of the exercise.
44. Record the value shown in the CPU Load Graph under Case #5 in Table 12-1.
45. Open the Execution Graph. On the menu bar, click: DSP/BIOS → Execution Graph
Presently, the execution graph is not displaying anything. This is because we have it disabled in the RTA Control Panel. In the RTA Control Panel, check the top four boxes to enable logging of all event types to the execution graph. Notice that the Execution Graph is now displaying information about the execution threads being taken by your software. This graph is not based on time, but on the activity of events (i.e. when an event happens, such as a SWI or periodic function beginning execution). Notice that the execution graph simply records DSP/BIOS CLK events along with other system events (the DSP/BIOS clock periodically triggers the DSP/BIOS scheduler). As a result, the time scale on the execution graph is not linear. The logging of events to the execution graph consumes CPU cycles, which is why the CPU Load Graph jumped as you enabled logging.
46. Record the value shown in the CPU Load Graph under Case #6 in Table 12-1.
47. Open the Statistics View window. On the menu bar, click: DSP/BIOS → Statistics View
Presently, the statistics view window is not changing, with the exception of the statistics for the IDL_busyObj row (i.e., the idle loop). This is because we have it disabled in the RTA Control Panel. In the RTA Control Panel, check the next five boxes (i.e., those with the word Accumulator in their description) to enable logging of statistics to the statistics view window. The logging of statistics consumes CPU cycles, which is why the CPU Load Graph jumped as you enabled logging.
48. Record the value shown in the CPU Load Graph under Case #7 in Table 12-1.
49. Table 12-1 should now be completely filled in. Think about the results. Your instructor will discuss them when the lecture starts again.
50. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.
Note: In this lab exercise only the basic features of DSP/BIOS and the real-time analysis tools have been used. For more information and details, please refer to the DSP/BIOS user's manuals and other DSP/BIOS related training.
End of Exercise
Optional Exercise:
Modify the lab to service the ADC without using the DMA, as was done in the Lab 8 exercise. Remove the call to the InitDma() function and enable the interrupts in the Adc_9_10_12.c file. Then use DSP/BIOS to convert the ADCINT_ISR HWI to a SWI. Recalculate the CPU computational burden of servicing the ADC without using the DMA.
A. In Main_12b.c comment out the code used to call the InitDma() function.
B. In Adc_9_10_12.c uncomment the code used to enable the ADC interrupt. The ADC will now trigger the interrupt rather than the DMA.
C. In DefaultIsr_12b.c locate the ADCINT_ISR() routine. Move the entire contents of the ADCINT_ISR() routine to the AdcSwi() function in Main_12b.c with the following exceptions: do not move the instruction used to acknowledge the PIE group interrupt, the static local variable declaration of GPIO32_count, or the GPIO pin toggle code / LED toggle code. Be sure to move the other static local variable declaration at the top of ADCINT_ISR() that is used to index into the ADC buffers.
D. In DefaultIsr_12b.c delete the interrupt keyword from ADCINT_ISR(). Next delete the LED toggle code and the declaration of GPIO32_count from the beginning of ADCINT_ISR(). This is already being done with a periodic function.
E. In DefaultIsr_12b.c add the following SWI_post to the ADCINT_ISR(), just after the structure used to acknowledge the PIE group:
SWI_post(&ADC_swi); // post a SWI
Save and close the updated files.
F. In the configuration file Lab.tcf add and set up the AdcSwi() SWI. Open Lab.tcf and click on the plus sign (+) to the left of Scheduling and again on the plus sign (+) to the left of SWI - Software Interrupt Manager.
G. Right click on SWI - Software Interrupt Manager and select Insert SWI. Rename SWI0 to ADC_swi and click OK. This is just an arbitrary name to differentiate the AdcSwi() function itself (which is nothing but an ordinary C function) from the DSP/BIOS SWI object which we are calling ADC_swi.
H. Select the Properties for ADC_swi and type _AdcSwi (with a leading underscore) in the function field. Click OK. This tells DSP/BIOS that it should run the function AdcSwi() when it executes the ADC_swi SWI.
I. Next, we need to have the PIE for the ADC interrupt use the dispatcher. The dispatcher will automatically perform the context save and restore, and allow the DSP/BIOS scheduler to have insight into the ISR. You may recall from an earlier lab that the ADC interrupt is located at PIE_INT1_6. Click on the plus sign (+) to the left of HWI - Hardware Interrupt Service Routine Manager. Click the plus sign (+) to the left of PIE INTERRUPTS. Locate the interrupt location for the ADC: PIE_INT1_6. Right click, select Properties, and select the Dispatcher tab. Now check the Use Dispatcher box and select OK. Close the configuration file and click YES to save changes.
J. Click the Build button to rebuild and load the project.
K. Run the code (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Run_Realtime_with_Reset.
L. Confirm that the graphical display is showing the correct results. The results should be the same as before (i.e., filtered PWM in the upper graph, unfiltered PWM in the lower graph). Note that the Execution Graph shows that ADC_swi is being serviced rather than DMA1_swi.
M. Notice and compare the CPU computational burden of servicing the ADC without using the DMA. The CPU load is now at about 41.4% as compared to 12.8% for Case #7.
N. Fully halt the DSP (real-time mode) by using the GEL function: GEL → Realtime Emulation Control → Full_Halt.
End of Optional Exercise
Objective:
- Program system into Flash Memory
- Learn use of CCS Flash Plug-in
- DO NOT PROGRAM PASSWORDS

[Lab diagram. Labels: ADC, RESULT0, DMA, connector wire, FIR Filter, pointer rewind. ePWM2 triggering ADC on period match using SOC A, with a trigger every 20.833 µs (48 kHz).]
Procedure
Project File
1. A project named Lab12c.pjt has been created for this lab. Open the project by clicking on Project → Open and look in C:\C28x\Labs\Lab12c. All Build Options have been configured the same as the previous lab. The files used in this lab are:

Adc_9_10_12.c                    Filter.c
CodeStartBranch.asm              Gpio.c
DefaultIsr_12c.c                 Lab.tcf
DelayUs.asm                      Lab_12c.cmd
Dma.c                            Labcfg.cmd
DSP2833x_GlobalVariableDefs.c    Main_12c.c
DSP2833x_Headers_BIOS.cmd        PieCtrl_12.c
ECap_7_8_9_10_12.c               SysCtrl.c
EPwm_7_8_9_10_12.c               Watchdog.c
Compiler Sections tab: .text, .switch, .cinit, .pinit, .econst / .const, .data
3. This step assigns the LOAD address of those sections that need to load to flash. Again using the memory section manager in the DSP/BIOS configuration tool (Lab.tcf), select the Load Address tab and check the Specify Separate Load Addresses box. Then set all entries to the flash memory block.
4. The section named IQmath is an initialized section that needs to load to and run from flash. Recall that this section is not linked using the DSP/BIOS configuration tool (Lab.tcf). Instead, this section is linked with the user linker command file (Lab_12c.cmd). Open and inspect Lab_12c.cmd. Previously the IQmath section was linked to L03SARAM. Notice that this section is now linked to FLASH.
load to flash (load address) but will run from L03SARAM (run address). Also notice that the linker has been asked to generate symbols for the load start, load end, and run start addresses. While not a requirement from a DSP hardware or development tools perspective (since the C28x DSP has a unified memory architecture), historical convention is to link code to program memory space and data to data memory space. Therefore, notice that for the L03SARAM memory we are linking secureRamFuncs to, we are specifying PAGE = 0 (which is program space).
11. Using the DSP/BIOS configuration tool (Lab.tcf) confirm that the entry for L03SARAM is defined as program space (code).
12. Open and inspect Main_12c.c. Notice that the memory copy function memcpy() is being used to copy the section secureRamFuncs, which contains the initialization function for the flash control registers.
13. Add a line of code to main() to call the InitFlash() function. There are no passed parameters or return values. You just type InitFlash(); at the desired spot in main().
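The load-address to run-address copy described in step 12 can be sketched on a host machine as follows. This is a minimal illustration under stated assumptions: the arrays stand in for flash and RAM, and the names load_start, load_end, run_start, and copy_section_to_ram are hypothetical placeholders for the linker-generated symbols, whose actual names are defined in the lab's linker command file.

```c
#include <stddef.h>
#include <string.h>

/* Host-side stand-ins (assumptions): on the target these would be the
 * linker-generated load-start, load-end, and run-start symbols. */
static const unsigned char flash_image[4] = { 0x11, 0x22, 0x33, 0x44 };
static unsigned char ram_run_area[4];

static const unsigned char *const load_start = flash_image;
static const unsigned char *const load_end   = flash_image + sizeof flash_image;
static unsigned char *const run_start        = ram_run_area;

/* Copy the section from its load address to its run address.
 * Returns the number of storage units copied. */
size_t copy_section_to_ram(void)
{
    size_t length = (size_t)(load_end - load_start);
    memcpy(run_start, load_start, length);
    return length;
}
```

The length comes from the difference between the load-end and load-start symbols, so the copy follows the section size automatically. Any function placed in the copied section can only be called after the copy has completed.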
17. Using the DSP/BIOS configuration tool (Lab.tcf) define memory blocks for PASSWORDS and CSM_RSVD. You will need to set up the MEM Properties for each memory block with the proper base address and length. Set the space to code for both memory blocks. (If needed, uncheck the create a heap in this memory box for each block.) You may also need to modify the existing flash memory block to avoid conflicts. If needed, a slide is available at the end of this lab showing the base address and length for the memory blocks.
Build Lab.out
23. At this point we need to build the project, but not have CCS automatically load it, since CCS cannot load code into the flash (the flash must be programmed). On the menu bar click: Option → Customize and select the Program/Project CIO tab. Uncheck Load Program After Build. CCS has a feature that automatically steps over functions without debug information. This can be useful for accelerating the debug process provided that you are not interested in debugging the function that is being stepped over. While single-stepping in this lab exercise we do not want to step over any functions. Therefore, select the Debug Properties tab. Uncheck Step over functions without debug information when source stepping, then click OK.
24. Click the Build button to generate the Lab.out file to be used with the CCS Flash Plug-in.
26. A Clock Configuration window may open. If needed, in the Clock Configuration window set OSCCLK (MHz): to 30, DIVSEL: to /2, and PLLCR Value: to 10. Then click OK. In the next Flash Programmer Settings window confirm that the selected DSP device to program is F28335 and all options have been checked. Click OK.
27. Notice that the eZdsp board uses a 30 MHz oscillator (located on the board near LED DS1). Confirm the Clock Configuration in the upper left corner has the OSCCLK set to 30 MHz, the DIVSEL set to /2, and the PLLCR value set to 10. Recall that the PLL output is divided by two, which gives a SYSCLKOUT of 150 MHz.
28. Confirm that all boxes are checked in the Erase Sector Selection area of the plug-in window. We want to erase all the flash sectors.
29. We will not be using the plug-in to program the Code Security Password. Do not modify the Code Security Password fields.
30. In the Operation block, notice that the COFF file to Program/Verify field automatically defaults to the current .out file. Check to be sure that Erase, Program, Verify is selected. We will be using the default wait states, as shown on the slide in this module.
31. Click Execute Operation to program the flash memory. Watch the programming status update in the plug-in window.
32. After successfully programming the flash memory, close the programmer window.
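The clock settings in steps 26 and 27 can be checked with a line of arithmetic: the oscillator frequency is multiplied by the PLLCR value and then divided by the DIVSEL divider. The helper below is a hypothetical sketch for that check only; it is not a register-level routine and it covers only nonzero multiplier values such as the one used here.

```c
#include <stdint.h>

/* Hypothetical helper: SYSCLKOUT in MHz from the oscillator frequency,
 * the PLLCR multiplier, and the DIVSEL divider (2 for the "/2" setting).
 * Returns 0 if the divider is 0. */
uint32_t sysclkout_mhz(uint32_t oscclk_mhz, uint32_t pllcr, uint32_t divider)
{
    if (divider == 0u) {
        return 0u;
    }
    return (oscclk_mhz * pllcr) / divider;
}
```

With the values from the lab, sysclkout_mhz(30, 10, 2) gives 150, matching the 150 MHz SYSCLKOUT stated in step 27.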
and select Lab12c.out in the Debug folder.
34. Reset the DSP. The program counter should now be at 0x3FF9A9, which is the start of the bootloader in the Boot ROM.
35. Single-Step <F11> through the bootloader code until you arrive at the beginning of the codestart section in the CodeStartBranch.asm file. (Be patient, it will take about 125 single-steps.) Notice that we have placed some code in CodeStartBranch.asm to give an option to first disable the watchdog, if selected.
36. Step a few more times until you reach the start of the C-compiler initialization routine at the symbol _c_int00.
37. Now do Debug → Go Main. The code should stop at the beginning of your main() routine. If you got to that point successfully, it confirms that the flash has been programmed properly, that the bootloader is properly configured for jump-to-flash mode, and that the codestart section has been linked to the proper address.
38. You can now RUN the DSP, and you should observe the LED on the board blinking. Try resetting the DSP and hitting RUN (without doing all the stepping and the Go Main procedure). The LED should be blinking again.
Position 4    Position 3    Position 2    Position 1
GPIO87        GPIO86        GPIO85        GPIO84
Right - 0     Left - 1      Right - 0     Right - 0
End of Exercise
Memory block    Base address    Length     Space
FLASH           0x30 0000       0x3FF80    code
CSM_RSVD        0x33 FF80       0x76       code
BEGIN_FLASH     0x33 FFF6       0x2        code
PASSWORDS       0x33 FFF8       0x8        code

[Slide residue: Lab_12.cmd SECTIONS { } entries for codestart and csm_rsvd. Startup sequence labels: RESET, 0x3F F000, 0x33 7FF6, FLASH (256Kw) at 0x30 0000, rts2800_ml.lib, IDL_run( ).]
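One way to sanity-check memory block definitions like these before entering them in the configuration tool is to confirm that each block ends exactly where the next begins. The sketch below is a host-side check written for illustration; the MemBlock type and the function name are hypothetical, and the base addresses and lengths are the ones listed on the slide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one memory block (illustration only). */
typedef struct {
    const char *name;
    uint32_t    base;   /* base address */
    uint32_t    len;    /* length in words */
} MemBlock;

/* Values taken from the slide, in ascending address order. */
static const MemBlock flash_map[] = {
    { "FLASH",       0x300000u, 0x3FF80u },
    { "CSM_RSVD",    0x33FF80u, 0x76u    },
    { "BEGIN_FLASH", 0x33FFF6u, 0x2u     },
    { "PASSWORDS",   0x33FFF8u, 0x8u     },
};

/* True if every block ends exactly where the next one starts,
 * which also means no two blocks overlap. */
bool blocks_are_contiguous(const MemBlock *blocks, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        if (blocks[i - 1].base + blocks[i - 1].len != blocks[i].base) {
            return false;
        }
    }
    return true;
}
```

For the four blocks above the check passes, and the last block ends at 0x33FFF8 + 0x8 = 0x340000.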
Development Support
Introduction
This module contains various references to support the development process.
Learning Objectives
- TI Workshops Download Site
- Signal Processing Libraries
- TI Development Tools
- Additional Resources
  - Internet
  - Product Information Center
Module Topics
Development Support
  Module Topics
  TI Support Resources
TI Support Resources
http://www.ti.com/c2000
C2000 controlCARDs
- New low-cost single-board controllers, perfect for initial software development and small-volume system builds
- Small form factor (9 cm x 2.5 cm) with standard 100-pin DIMM interface
  - Analog I/O, digital I/O, and JTAG signals available at DIMM interface
- Galvanically isolated RS-232 interface
- Single 5 V power supply required (not included)
- Available through TI authorized distributors and on the TI web

Part Numbers:
- TMDSCNCD2808 (100 MHz F2808)
- TMDSCNCD28044 (100 MHz F28044)
- TMDSCNCD28335 (150 MHz F28335)
Number
+32 (0) 27 45 55 32
+33 (0) 1 30 70 11 64
+49 (0) 8161 80 33 11
1800 949 0107 (free phone)
800 79 11 37 (free phone)
+31 (0) 546 87 95 45
+34 902 35 40 28
+46 (0) 8587 555 22
+44 (0) 1604 66 33 99
+358 (0) 9 25 17 39 48

- Literature, Sample Requests and Analog EVM Ordering
- Information, Technical and Design support for all Catalog TI Semiconductor products/tools
- Submit suggestions and errata for tools, silicon and documents
Note: This appendix only provides a description of the eZdsp F28335 interfaces used in this workshop. For a complete description of all features and details, please see the eZdsp F28335 Technical Reference manual.
Appendix
Module Topics
Appendix A - eZdsp F28335
  Module Topics
  eZdsp F28335
    eZdsp F28335 Connector / Header and Pin Diagram
    P2 Expansion Interface
    P4/P8/P7 I/O Interface
    P5/P9 Analog Interface
    P10 Expansion Interface
    SW1 Boot Load Option Switch
    DS1/DS2 LEDs
    TP1/TP2/TP3/TP4 Test Points
eZdsp F28335
eZdsp F28335 Connector / Header and Pin Diagram
P2 Expansion Interface
Position 4    Position 3    Position 2    Position 1
GPIO87        GPIO86        GPIO85        GPIO84
Right - 0     Left - 1      Left - 1      Left - 1
Right - 0     Left - 1      Right - 0     Left - 1
DS1/DS2 LEDs
Learning Objectives
- Explain .sect and .usect assembly directives
- Explain assembly addressing modes
- Understand instruction formats
- Describe options for each addressing mode
Module Topics
Appendix B - Addressing Modes
  Module Topics
  Labels, Mnemonics and Assembly Directives
  Addressing Modes
  Instruction Formats
  Register Addressing
  Immediate Addressing
  Direct Addressing
  Indirect Addressing
  Review
    Exercise B
  Lab B: Addressing
  OPTIONAL Lab B-C: Array Initialization in C
  Solutions
Mnemonics
- Lines of instructions
- Use upper or lower case
- Become components of program memory

Assembly Directives
- Begin with a period (.) and are lower case
- Used by the linker to locate code and data into specified sections
- Directives allow you to:
  - Define a label as global
  - Reserve space in memory for un-initialized variables
  - Initialize memory

        .ref    start
        .sect   "vectors"       ;make reset vector address 'start'
reset:  .long   start

        .def    start
count   .set    9
; create an array x of 10 words
x       .usect  "mydata",10
        .sect   "code"
start:  C28OBJ                  ;operate in C28x mode
        MOV     ACC,#1
next:   MOVL    XAR1,#x
        MOV     AR2,#count
loop:   MOV     *XAR1++,AL
        BANZ    loop,AR2--
bump:   ADD     ACC,#1
        SB      next,UNC

Directives
.sect "name" - initialized section, used for code or constants
Addressing Modes
Mode                    Symbol    Purpose
Register (register)               Operate between registers
Immediate (constant)    #         Constants and initialization
Direct (paged)          @         General-purpose access to data
Indirect (pointer)      *         Support for pointers: access arrays, lists, tables
Four main categories of addressing modes are available on the C28x. Register addressing mode allows interchange between all CPU registers, convenient for solving intricate equations. Immediate addressing is helpful for expressing constants easily. Direct addressing mode allows information in memory to be accessed. Indirect addressing allows pointer support via dedicated auxiliary registers, and includes the ability to index, or increment through a structure. The C28x supports a true software stack, desirable for supporting the needs of the C language and other structured programming environments, and presents a stack-relative addressing mode for efficiently accessing elements from the stack. Paged direct addressing offers general-purpose single cycle memory access, but restricts the user to working in any single desired block of memory at one time.
Instruction Formats

INSTR dst,src

Format              Example
INSTR REG           NEG  AL
INSTR REG,#imm      MOV  ACC,#1
INSTR REG,mem       ADD  AL,@x
INSTR mem,REG       SUB  AL,@AR0
INSTR mem,#imm      MOV  *XAR0++,#25

What is a REG?
- 16-bit access = AR0 through AR7, AH, AL, PH, PL, T and SP
- 32-bit access = XAR0 through XAR7, ACC, P, XT
What is an #imm?
- An immediate constant stored in the instruction
What is a mem?
- A directly or indirectly addressed operand from data memory
- Or, one of the registers from REG!
- loc16 or loc32 (for 16-bit or 32-bit data access)
The C28x follows a convention that uses instruction, destination, then source operand order (INSTR dst, src). Several general formats exist to allow modification of memory or registers based on constants, memory, or register inputs. Different modes are identifiable by their leading characters (# for immediate, * for indirect, and @ for direct). Note that registers or data memory can be selected as a mem value.
Register Addressing
[Figure: register addressing. 32-bit registers: ACC, P, XT. 16-bit registers: PH, PL, T, TL, DP, SP.]
- Allows for efficient register to register operation
- 16-bit and 32-bit register address modes
- Reduces code overhead, memory accesses, and memory overhead
Register addressing allows the exchange of values between registers, and with certain instructions can be used in conjunction with other addressing modes, yielding a more efficient instruction set. Remember that any mem field allows the use of a register as the operand, and that no special character (such as @, *, or #) need be used to specify the register mode.
Immediate Addressing #
[Figure: a one-word instruction made of an OPCODE and an 8-bit OPERAND]
- Fixed value, part of the program memory instruction
- Supports short (8-bit) and long (16-bit) immediate constants
- Long immediate can include a shift
- Used to initialize registers, and operate with constants
Immediate addressing allows the user to specify a constant within an instruction mnemonic. Short immediates fit in a single-word instruction and execute in a single cycle. Long (16-bit) immediates allow full-sized values; they make the instruction two words long, yet it still executes in a single instruction cycle.
Direct Addressing
Direct addressing allows for access to the full 4-Meg words space in 64 word page groups. As such, a 16-bit Data Page register is used to extend the 6-bit local address in the instruction word. Programmers should note that poor DP management is a key source of programming errors. Paged direct addressing is fast and reliable if the above considerations are followed. The watch operation, recommended for use whenever debugging, extracts the data page and displays it as the base address currently in use for direct addressing.
Direct Addressing @

Data Page (16 bits)      Offset (6 bits)         Data Memory
00 0000 0000 0000 00     00 0000 to 11 1111      Page 0: 00 0000 to 00 003F
00 0000 0000 0000 01     00 0000 to 11 1111      Page 1: 00 0040 to 00 007F
00 0000 0000 0000 10     00 0000 to 11 1111      Page 2: 00 0080 to 00 00BF
...
11 1111 1111 1111 11     00 0000 to 11 1111      (last page)

- Data memory space divided into 65,536 pages with 64 words on each page
- Data page pointer (DP) used to select active page
- 16-bit DP is concatenated with a 6-bit offset from the instruction to generate an absolute 22-bit address
- Access data on a given page in any order
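The concatenation described above can be sketched in C. This is only an illustration of the address arithmetic, run on a host machine; the function names are made up for the example and are not part of any TI tool.

```c
#include <assert.h>
#include <stdint.h>

/* Form the 22-bit data address used by direct addressing:
   the 16-bit DP supplies bits 21..6, the 6-bit offset supplies bits 5..0. */
uint32_t direct_address(uint16_t dp, uint8_t offset6)
{
    assert(offset6 < 64);                 /* only 6 offset bits are available */
    return ((uint32_t)dp << 6) | offset6;
}

/* Split a data address back into its page number and in-page offset. */
uint16_t page_of(uint32_t addr)   { return (uint16_t)(addr >> 6); }
uint8_t  offset_of(uint32_t addr) { return (uint8_t)(addr & 0x3Fu); }
```

With DP = 7 and offset 3Dh this gives 0001FDh. Two addresses that differ in `page_of()` cannot be reached with the same DP value.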
Direct Addressing example: Z = X + Y

DP = 0007
                         Accumulator
                         - - - -  - - - -
    MOV   AL,@x          0000 1000
    ADD   AL,@y          0000 1500
    MOV   @z,AL

Data Memory       address   data
Page7[00]         0001C0    0001
...
Page7[3D]   x:    0001FD    1000
Page7[3E]   y:    0001FE    0500
Page7[3F]   z:    0001FF    1500

Variations for loading DP:
    MOVW  DP,#imm        ;2W, 16-bit (4 Meg)
    MOVZ  DP,#imm        ;1W, 10-bit (64K)
    MOV   DP,#imm        ;DP(15:10) unchanged

Direct Addressing pitfall: Z = X + Y across a page boundary

x       .usect  "samp",3
        .sect   "code"
        MOVW    DP,#x
        MOV     AL,@x
        ADD     AL,@y
        MOV     @z,AL

In this case x sits in the last word of page 7 (address 0001FF) and y in the first word of the next page (address 000200). DP stays at 0007 for all three instructions, so ADD AL,@y reads offset 00 of page 7 instead of y. The accumulator goes from 0000 1000 to 0000 1001, where 1500 was expected.

Solution:

x       .usect  "samp",3,1   ;Force all locations to same data page (1st hole, else linker error)
y       .set    x+1          ;Assign vars within block
z       .set    x+2
Indirect Addressing
Indirect Addressing *
[Figure: XAR0 through XAR7 and the ARAU addressing data memory]
- Auxiliary Registers (XARn) used to access full data memory space
- Address Register Arithmetic Unit (ARAU) used to modify the XARn
- Access data from arrays anywhere in data memory in an orderly fashion
Any of eight hardware pointers (ARs) may be employed to access values from the first 64K of data memory. Auto-increment or decrement is supported at no additional cycle cost. XAR register formats offer larger 32-bit widths, allowing them to access across the full 4-Giga words data space.
Circular: *AR6%++
AR1(7:0) is buffer size XAR6 is current address
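As a rough model of one `*AR6%++` step for 16-bit data, the pointer update can be sketched in C. The sketch assumes the wrap rule as this editor understands it from the CPU reference guide: the low 8 bits of XAR6 are compared with AR1(7:0), which is taken here to hold the index of the last element (size minus one), and the buffer is assumed to start on a 256-word boundary. Treat these points as assumptions to check against the reference guide; the function name is hypothetical.

```c
#include <stdint.h>

/* Model of the XAR6 update after one circular access (*AR6%++), 16-bit data.
   Assumption: AR1(7:0) holds the offset of the last buffer element. */
uint32_t circ_post_increment(uint32_t xar6, uint16_t ar1)
{
    if ((xar6 & 0xFFu) == (ar1 & 0xFFu)) {
        /* Last element reached: clear the low 8 bits to wrap to the start. */
        return xar6 & ~(uint32_t)0xFFu;
    }
    /* Otherwise step to the next word; only the low 16 bits are modified. */
    return (xar6 & 0xFFFF0000u) | ((xar6 + 1u) & 0xFFFFu);
}
```

For a 4-element buffer at C000h with AR1(7:0) = 3, the pointer steps C000h, C001h, C002h, C003h and then returns to C000h.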
y = x0 + x1 + x2 + x3 + x4 (the sum of xn for n = 0 to 4)

x       .usect  "samp",6
y       .set    (x + 5)
        .sect   "code"
        MOVL    XAR2,#x
        MOV     ACC,*XAR2++
        ADD     ACC,*XAR2++
        ADD     ACC,*XAR2++
        ADD     ACC,*XAR2++
        ADD     ACC,*XAR2++
        MOV     *(0:y),AL

[Figure: data memory holding x0, x1, x2, x3, x4 and then y, with XAR2 stepping through the x values]

*(0:16bit):
- 16 bit label
- must be in lower 64K
- 2 word instruction
[Figure: XAR2 pointing at x0, with index [3] selecting x3 from x0, x1, x2, x3, x4]

16 bit offset (index held in AR0 or AR1):

x       .usect  "samp",5
        .sect   "code"
        MOVL    XAR2,#x
        MOV     AR0,#1
        MOV     AR1,#3
        MOV     ACC,*+XAR2[AR0]
        ADD     ACC,*+XAR2[AR1]
        MOV     *+XAR2[2],AL

3 bit offset (constant index):

x       .usect  "samp",5
        .sect   "code"
        MOVL    XAR2,#x
        MOV     ACC,*+XAR2[1]
        ADD     ACC,*+XAR2[3]
        MOV     *+XAR2[2],AL
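x2 = x1 + x3
[Figure: stack-relative addressing example. x1, x2 and x3 are stack entries addressed relative to SP. The accumulator holds 0000 0200 after instruction 1 and 0000 0320 after instruction 2; instruction 3 stores the result to x2.]
[Figure: circular addressing. The buffer starts at an address whose low 8 bits are zero (AAAA AAAA 0000 0000) and runs to element N-1; XAR6 (AAAA AAAA xxxx xxxx) is the access pointer. Example instruction: MAC P,*AR6%++,*XAR7++. The buffer section is placed in RAM, PAGE 1 by the linker command file.]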
Review
[Figure: data memory ranges, marked at 0x3FFFFF and 0xFFFFFFFF]
Data memory can be accessed in numerous ways:
- Stack Addressing: allows a range to 64K
- Direct Addressing: offers a 16-bit DP plus a 6-bit offset, allowing a 4M range
- Indirect Addressing: offers the full 4G range
Exercise B: Addressing
Given: Address/Data (hex)

DP = 4000          DP = 4004          DP = 4006
100030  0025       100100  0105       100180  0100
100031  0120       100101  0060       100181  0030
100032             100102  0020       100182  0040

Fill in the table below (columns: Src Mode, ACC, DP, AR1/XAR1, AR2/XAR2)

Program
    MOVW  DP,#4000h
    MOVL  XAR1,#100100h
    MOVL  XAR2,#100180h
    MOV   AL,@31h
    ADD   AL,*XAR1++
    SUB   AL,@30h
    ADD   AL,*XAR1++
    MOVW  DP,#4006h
    ADD   AL,@1
    SUB   AL,*XAR1
    ADD   AL,*XAR2
    SUB   AL,*+XAR2[1]
    ADD   AL,#32
    SUB   AL,*+XAR2[2]
    MOV   @32h,AL

Dir: Direct; Idr: Indirect
In the table above, fill in the values of each of the registers for each of the instructions. Three areas of data memory are displayed at the top of the diagram, showing both their addresses and contents in hexadecimal. Watch out for surprises along the way. First, identify the addressing mode of the source operand. Then, fill in the values changed as a result of the instruction.
Lab B: Addressing
Note: The lab linker command file is based on the F28335 memory map. Modify as needed if using a different F28xx device memory map.
Objective
The objective of this lab is to practice and verify the mechanics of addressing. In this process we will expand upon the ASM file from the previous lab to include new functions. Additionally, we learn how to run and observe the operation of code using Code Composer Studio. In this lab, we will initialize the vars arrays allocated in the previous lab with the contents of the const table. How is this best accomplished? Consider the process of loading the first const value into the accumulator and then storing this value to the first vars location, and repeating this process for each of the succeeding values. What forms of addressing could be used for this purpose? Which addressing mode would be best in this case? Why? What problems could arise with using another mode?
Procedure
To perform the copy, consider using a load/store method via the accumulator. Which part of an accumulator (low or high) should be used? Use the following when writing your copy routine:
- use AR1 to hold the address of table
- use AR2 to hold the address of data
3. It is good practice to trap the end of the program (i.e. use either "end: B end,UNC" or "end: B start,UNC"). Save your work.
If you wish, right click on the LabB.asm source window and select Mixed Mode to debug using both source and assembly.
Note: Code Composer Studio can automatically load the output file after a successful build. On the menu bar click: Option → Customize and select the Program Load Options tab, check Load Program After Build, then click OK.
6. Single-step your routine. While single-stepping, it is helpful to see the values located in table[9] and data[9] at the same time. Open two memory windows by using the View Memory button on the vertical toolbar and using the address labels table and data. Setting the properties field to Hex 16 Bit TI style will give you more viewable data in the window. Additionally, it is useful to watch the CPU registers. Open the CPU registers by using View → Registers → CPU Registers. Deselect Allow Docking and move/resize the window as needed. Check to see if the program is working as expected.
End of Exercise
Objective
The objective of this lab is to practice and verify the mechanics of initialization using C. Additionally, we learn how to run and observe the operation of C code using Code Composer Studio. In this lab, we will initialize the vars arrays with the contents of the const table.
Procedure
Solutions
Program
    MOVW  DP,#4000h
    MOVL  XAR1,#100100h
    MOVL  XAR2,#100180h
    MOV   AL,@31h
    ADD   AL,*XAR1++
    SUB   AL,@30h
    ADD   AL,*XAR1++
    MOVW  DP,#4006h
    ADD   AL,@1
    SUB   AL,*XAR1
    ADD   AL,*XAR2
    SUB   AL,*+XAR2[1]
    ADD   AL,#32
    SUB   AL,*+XAR2[2]
    MOV   @32h,AL

Dir: Direct; Idr: Indirect
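One way to check a filled-in table is to trace the exercise program on a host machine. The C sketch below is a hand-written model, not TI code: it assumes direct addresses are formed as (DP << 6) | offset, that `#32` is a decimal constant, that unlisted memory reads as zero, and it only models AL, DP, XAR1 and XAR2. The helper names `rd`, `wr` and `dir` are invented for the example.

```c
#include <stdint.h>

#define NCELLS 16

/* Sparse model of the three data memory areas given in the exercise. */
static uint32_t cell_addr[NCELLS] = {
    0x100030, 0x100031, 0x100100, 0x100101, 0x100102,
    0x100180, 0x100181, 0x100182 };
static uint16_t cell_data[NCELLS] = {
    0x0025, 0x0120, 0x0105, 0x0060, 0x0020,
    0x0100, 0x0030, 0x0040 };
static int ncells = 8;

/* Read a word; addresses not listed in the exercise read as 0 (assumption). */
uint16_t rd(uint32_t a)
{
    for (int i = 0; i < ncells; i++)
        if (cell_addr[i] == a) return cell_data[i];
    return 0;
}

/* Write a word, adding a new cell if the address was not present. */
void wr(uint32_t a, uint16_t v)
{
    for (int i = 0; i < ncells; i++)
        if (cell_addr[i] == a) { cell_data[i] = v; return; }
    if (ncells < NCELLS) { cell_addr[ncells] = a; cell_data[ncells++] = v; }
}

/* Direct addressing: 16-bit DP concatenated with a 6-bit offset. */
static uint32_t dir(uint16_t dp, uint8_t off) { return ((uint32_t)dp << 6) | off; }

/* Trace of the Exercise B program; returns the final value of AL. */
uint16_t run_exercise_b(void)
{
    uint16_t dp = 0x4000;                       /* MOVW DP,#4000h      */
    uint32_t xar1 = 0x100100;                   /* MOVL XAR1,#100100h  */
    uint32_t xar2 = 0x100180;                   /* MOVL XAR2,#100180h  */
    uint16_t al;

    al = rd(dir(dp, 0x31));                     /* MOV AL,@31h         */
    al = (uint16_t)(al + rd(xar1++));           /* ADD AL,*XAR1++      */
    al = (uint16_t)(al - rd(dir(dp, 0x30)));    /* SUB AL,@30h         */
    al = (uint16_t)(al + rd(xar1++));           /* ADD AL,*XAR1++      */
    dp = 0x4006;                                /* MOVW DP,#4006h      */
    al = (uint16_t)(al + rd(dir(dp, 1)));       /* ADD AL,@1           */
    al = (uint16_t)(al - rd(xar1));             /* SUB AL,*XAR1        */
    al = (uint16_t)(al + rd(xar2));             /* ADD AL,*XAR2        */
    al = (uint16_t)(al - rd(xar2 + 1));         /* SUB AL,*+XAR2[1]    */
    al = (uint16_t)(al + 32);                   /* ADD AL,#32 (decimal) */
    al = (uint16_t)(al - rd(xar2 + 2));         /* SUB AL,*+XAR2[2]    */
    wr(dir(dp, 0x32), al);                      /* MOV @32h,AL         */
    return al;
}
```

Under these assumptions the two surprises are that `#32` is decimal (20h), and that the final store uses DP = 4006h, so the result lands at 1001B2h rather than at 100032h.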
Learning Objectives
- Perform simple program control using branch and conditional codes
- Write C28x code to perform basic arithmetic
- Use the multiplier to implement sum-of-products equations
- Use the RPT instruction (repeat) to optimize loops
- Use MAC for long sum-of-products
- Efficiently transfer the contents of one area of memory to another
- Examine read-modify-write operations
Module Topics
Appendix C Assembly Programming
  Module Topics
  Program Control
    Branches
    Program Control Instructions
  ALU and Accumulator Operations
    Simple Math & Shift
  Multiplier
    Basic Multiplier
    Repeat Instruction
    MAC Instruction
  Data Move
  Logical Operations
    Byte Operations and Addressing
    Test and Change Memory Instructions
    Min/Max Operations
  Read Modify Write Operations
  Lab C: Assembly Programming
  OPTIONAL Lab C-C: Sum-of-Products in C
Program Control
The program control logic and program address generation logic work together to provide proper program flow. Normally, the flow of a program is sequential: the CPU executes instructions at consecutive program memory addresses. At times, a discontinuity is required; that is, a program must branch to a nonsequential address and then execute instructions sequentially at that new location. For this purpose, the C28x supports interrupts, branches, calls, returns, and repeats. Proper program flow also requires smooth flow at the instruction level. To meet this need, the C28x has a protected pipeline and an instruction-fetch mechanism that attempts to keep the pipeline full.
Branches
Program Memory
PC 0x3FFFFF
The PC can access the entire 4M words (8M bytes) range. Some branching operations offer 8- and 16-bit relative jumps, while long branches, calls, and returns provide a full 22-bit absolute address. Dynamic branching allows a run-time calculated destination. The C28x provides the familiar arithmetic result status bits (Zero, oVerflow, Negative, Carry) plus a Test Control bit which holds the result of a binary test. The states of these bits in various combinations support a range of signed, unsigned, and binary branching conditions.
Instruction
    SB    8bit,cond
    SBF   8bit,EQ|NEQ|TC|NTC
    B     16bit,cond
    BF    16bit,cond
    LB    22bit
    LB    *XAR7
    BAR   16bit,ARn,ARn,EQ|NEQ
    BANZ  16bit,ARn--
- Condition flags are set on the prior use of the ALU
- The assembler will optimize B to SB if possible

    LCR   22bit
    LCR   *XARn
4 4
4 4 8
More call variations in the user guide are for code backward compatibility.
[Figure: LCR Func and LRETR. The figure shows the PC, the RPC before and after the call (Old RPC, New RPC), the return address (Ret Addr) and Func.]
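y = x0 + x1 + x2 + x3 + x4 (the sum of xn for n = 0 to 4)
[Figure: data x0 to x4 addressed through XAR2, followed by y; AR3 holds COUNT]
[Code listing: len, x and y are defined with .set, .usect and .set, followed by .sect; the code is MOVL, MOV, MOV, then a loop labelled sum: built from ADD and BANZ, and a final MOV]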
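The loop structure of a BANZ-based sum can be sketched in C. This is an illustrative host-side model only: it assumes BANZ branches while the auxiliary register is non-zero and decrements it on each branch, so a counter loaded with len - 1 gives len passes. The function name and `sample_x` data are invented for the example, and len must be at least 1.

```c
#include <stdint.h>

/* Sample data standing in for x0..x4 (hypothetical values). */
const int16_t sample_x[5] = { 1, 2, 3, 4, 5 };

/* Sum len values the way an ADD / BANZ loop would. Requires len >= 1. */
int32_t sum_with_banz(const int16_t *x, uint16_t len)
{
    const int16_t *xar2 = x;             /* pointer register              */
    uint16_t ar3 = (uint16_t)(len - 1);  /* COUNT = len - 1               */
    int32_t acc = 0;                     /* accumulator cleared first     */

    for (;;) {
        acc += *xar2++;                  /* ADD ACC,*XAR2++               */
        if (ar3 == 0)                    /* BANZ falls through at zero    */
            break;
        ar3--;                           /* else decrement and branch back */
    }
    return acc;
}
```

Loading the counter with len rather than len - 1 would add one element too many, which is a common slip with this kind of loop.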
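ALU and Accumulator Operations
[Figure: ALU and accumulator block diagram, with a MUX, an 8/16-bit immediate input and the status registers ST0 and ST1]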
One of the major components in the execution unit is the Arithmetic Logic Unit (ALU). To support traditional Digital Signal Processing (DSP) operations, the ALU also has a zero-cycle barrel shifter and the accumulator. The enhancement in the C28x is the additional data paths from the ALU to all internal CPU registers and to data memory. The connection to all internal registers helps the compiler to generate efficient C code. The data path to memory allows the C28x to perform single atomic read-modify-write instructions on memory. The following slides introduce various instructions that use the ALU hardware. Word, byte, and long word (32-bit) operations are supported.
[Slide: Simple Math & Shift. Ax = AH or AL. Accumulator operations take ACC,loc16<<shift (from memory, left shift optional) or ACC,#16b<<shift (16-bit constant, left shift optional). Ax operations take the forms Ax,loc16; Ax,loc16,#16b; Ax; and loc16,Ax.]
[Slide: Shift AL or AH. LSL, LSR and ASR operate on AX with either an immediate shift count or a shift count held in T. The figures show the 16-bit Ax register (bits 15 to 0) for LSL, ASR and LSR, together with the carry bit C. For the accumulator shift SFR, the bits shifted in are 0 or 1 based on SXM.]
Multiplier
[Figure: Multiply Unit. The XT/T register and a data memory or register operand feed the 16x16 and 32x32 multiply unit through a MUX; the result goes to the 32-bit P register.]
Digital signal processors require many multiply and add operations. The single-cycle multiplier is the second major component in the execution unit. The C28x has the traditional 16-bit-by-16-bit multiplier found in previous TI DSP families. In addition, the C28x has a single-cycle 32-bit-by-32-bit multiplier to perform extended precision math operations. The large multiplier allows the C28x to support higher performance control system requirements while keeping code small. The following slides introduce instructions that use the 16-bit-by-16-bit multiplier and multiply and add (MAC) operations. The 32-bit-by-32-bit multiplication will be covered in the appendix.
Basic Multiplier
Multiplier Instructions
Instruction           Execution          Purpose
MOV   T,loc16         T   = loc16        Get first operand
MPY   ACC,T,loc16     ACC = T*loc16      For single or first product
MPY   P,T,loc16       P   = T*loc16      For nth product
MPYB  ACC,T,#8bu      ACC = T*8bu        Using 8-bit unsigned const
MPYB  P,T,#8bu        P   = T*8bu        Using 8-bit unsigned const
MOV   ACC,P           ACC = P            Move 1st product<<PM to ACC
ADD   ACC,P           ACC += P           Add nth product<<PM to ACC
SUB   ACC,P           ACC -= P           Sub nth product<<PM fr. ACC

Instruction           Execution
MOVP  T,loc16         ACC  = P<<PM       T = loc16
MOVA  T,loc16         ACC += P<<PM       T = loc16
MOVS  T,loc16         ACC -= P<<PM       T = loc16
MPYA  P,T,#16b        ACC += P<<PM       then P = T*#16b
MPYA  P,T,loc16       ACC += P<<PM       then P = T*loc16
MPYS  P,T,loc16       ACC -= P<<PM       then P = T*loc16
Sum-of-Products
Y = A*X1 + B*X2 + C*X3 + D*X4
    ZAPA                    ;ACC = P = OVC = 0
    MOV   T,@X1             ;T = X1
    MPY   P,T,@A            ;P = A*X1
    MOVA  T,@X2             ;T = X2 ;ACC = A*X1
    MPY   P,T,@B            ;P = B*X2
    MOVA  T,@X3             ;T = X3 ;ACC = A*X1 + B*X2
    MPY   P,T,@C            ;P = C*X3
    MOVA  T,@X4             ;T = X4 ;ACC = A*X1 + B*X2 + C*X3
    MPY   P,T,@D            ;P = D*X4
    ADDL  ACC,P<<PM         ;ACC = Y
    MOVL  @y,ACC
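The data flow of the listing above, where each MOVA adds the previous product while the next operand is loaded, can be sketched in C. This is a host-side illustration that assumes PM = 0 (no product shift) and 16-bit signed operands; the names `sum_of_products4`, `demo_coef` and `demo_x` are invented for the example.

```c
#include <stdint.h>

/* Hypothetical sample data: coefficients A..D and inputs X1..X4. */
const int16_t demo_coef[4] = { 1, 2, 3, 4 };
const int16_t demo_x[4]    = { 10, 20, 30, 40 };

/* Y = A*X1 + B*X2 + C*X3 + D*X4 using a one-stage product pipeline. */
int32_t sum_of_products4(const int16_t coef[4], const int16_t x[4])
{
    int32_t acc = 0;                     /* ZAPA: ACC = 0                 */
    int32_t p   = 0;                     /* ZAPA: P = 0                   */

    for (int i = 0; i < 4; i++) {
        acc += p;                        /* MOVA: add the previous product */
        p = (int32_t)coef[i] * x[i];     /* MPY:  16x16 -> 32-bit product  */
    }
    acc += p;                            /* ADDL ACC,P<<PM: last product   */
    return acc;
}
```

The final addition after the loop matters: without it the last product stays in P and never reaches the result.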
- Integer long multiplication: u(long) = u(long) * u(long)
- Fraction long multiplication: (long) = (long) * (long)
[Figure: long multiplication with operands X1 and Y1 and result words Z3, Z2, Z1, Z0, held in the accumulator and the P-register]
Repeat Instruction
Features:
- Next instruction iterated N+1 times
- Saves code space: 1 word
- Low overhead: 1 cycle
- Easy to use
- Non-interruptible
- Requires use of || before next line
- May be nested within BANZ loops
The single repeat instruction (RPT) is used to reduce code size and speed up many operations in a DSP application. Some of the most popular uses of RPT are multi-tap digital filters and block data transfers.
MAC Instruction
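[Figure: MAC instruction. The first operand steps through X0, X1 ... X19 with XAR1++ and the second through A0, A1 ... A19 with XAR7++. The slide lists MOV T,loc16 and ADD ACC,P alongside MAC.]
- Second operand must use XAR7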
Data Move
[Code listing: starting at START:, two MOVL instructions set up the pointers and RPT || PREAD performs the copy; x is allocated with .usect, and the table TBL: is defined with .word, with len defined by .set]
- Optimal with RPT (speed and code size)
- In RPT, non-mem address is auto-incremented in PC
- Faster than load/store, avoids accumulator
- Allows access to program memory
Conditional Moves
[Table: Instruction versus Execution (if COND is met). The destination is loaded with AX or with an 8-bit constant.]
Example
            Accumulator    Data Memory
Before      0000 0120      A = 0120, B = 0320
After                      A = 0120, B = 0120
The conditional move instruction is an excellent way to avoid a discontinuity (branch or call) based upon a condition code set prior to the instruction. In the above example, the first step is to place the contents of A into the accumulator. Once the Ax content is tested, by using the CMP instruction, the conditional move can be executed. If the specified condition being tested is true, then the location pointed to by the loc16 addressing mode will be loaded with the contents of the specified AX register (AH or AL) or with the zero-extended 8-bit constant:
    if (COND == true) [loc16] = AX or 0:8bit;
Note: Addressing modes are not conditionally executed. Hence, if an addressing mode performs a pre- or post-modification, it will execute regardless of whether the condition is true or not. This instruction is not repeatable. If this instruction follows the RPT instruction, it resets the repeat counter (RPTC) and executes only once.
Flags and Modes
N - If the condition is true, then after the move, AX is tested for a negative condition. The negative flag bit is set if bit 15 of AX is 1, otherwise it is cleared.
Z - If the condition is true, then after the move, AX is tested for a zero condition. The zero flag bit is set if AX = 0, otherwise it is cleared.
V - If the V flag is tested by the condition, then V is cleared.
C-Example
    ; if ( VarA > 20 )
    ;     VarA = 0;
    CMP   @VarA,#20         ; Set flags on (VarA - 20)
    MOVB  @VarA,#0,GT       ; Zero VarA if greater than
Logical Operations
Byte Operations and Addressing
Byte Operations

Instruction             Target   Upper byte   Lower byte
MOVB  AX.LSB,loc16      AX       0000 0000    Byte
MOVB  AX.MSB,loc16      AX       Byte         No change
MOVB  loc16,AX.LSB      loc16    No change    Byte
MOVB  loc16,AX.MSB      loc16    No change    Byte

Byte =
1. Low byte for register addressing
2. Low byte for direct addressing
3. Selected byte for offset indirect addressing

For loc16 = *+XARn[Offset], an odd offset selects the upper byte of loc16 and an even offset selects the lower byte.

Byte Addressing
[Figure: AH.MSB = 12, AH.LSB = 34, AL.MSB = 56, AL.LSB = 78 moved to and from 16 bit memory addressed through AR2, with byte offsets 00 to 07 laid out two per 16-bit word]
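The byte selection rule for offset indirect addressing can be sketched in C. The sketch assumes the offset is a byte offset, so the word accessed is at index offset / 2, with even offsets selecting the low byte and odd offsets the high byte; check this against the CPU reference guide before relying on it. The function names and `demo_mem` contents are invented for the example.

```c
#include <stdint.h>

/* Hypothetical 16-bit memory pointed to by the address register. */
uint16_t demo_mem[2] = { 0x1234, 0xABCD };

/* Read the byte at a byte offset from a 16-bit word array. */
uint8_t load_byte(const uint16_t *xar, unsigned offset)
{
    uint16_t word = xar[offset >> 1];            /* two bytes per word     */
    return (offset & 1u) ? (uint8_t)(word >> 8)  /* odd offset: high byte  */
                         : (uint8_t)(word & 0xFFu); /* even: low byte      */
}

/* Write one byte, leaving the other byte of the word unchanged. */
void store_byte(uint16_t *xar, unsigned offset, uint8_t b)
{
    uint16_t *w = &xar[offset >> 1];
    if (offset & 1u)
        *w = (uint16_t)((*w & 0x00FFu) | ((uint16_t)b << 8));
    else
        *w = (uint16_t)((*w & 0xFF00u) | b);
}
```

The store mirrors the "No change" entries in the table: only the selected byte of the 16-bit location is replaced.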
Min/Max Operations
MIN/MAX Operations

Instruction                             Execution
MAX     ACC,loc16                       if ACC <  loc16, ACC = loc16
                                        if ACC >= loc16, do nothing
MIN     ACC,loc16                       if ACC >  loc16, ACC = loc16
                                        if ACC <= loc16, do nothing
MAXL    ACC,loc32                       if ACC <  loc32, ACC = loc32
                                        if ACC >= loc32, do nothing
MINL    ACC,loc32                       if ACC >  loc32, ACC = loc32
                                        if ACC <= loc32, do nothing
MAXCUL  P,loc32 (for 64 bit math)       if P <  loc32, P = loc32
                                        if P >= loc32, do nothing
MINCUL  P,loc32 (for 64 bit math)       if P >  loc32, P = loc32
                                        if P <= loc32, do nothing
Find the maximum 32-bit number in a table:
    MOVL  ACC,#0
    MOVL  XAR1,#table
    RPT   #(table_length - 1)
 || MAXL  ACC,*XAR1++
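For comparison, the same search can be written as a short C loop. This is a host-side sketch with invented names and sample data; like the assembly, it starts from zero, so a table containing only negative values yields 0.

```c
#include <stdint.h>

/* Hypothetical sample tables. */
const int32_t demo_table[4]    = { 5, -3, 1000000, 42 };
const int32_t demo_negative[2] = { -5, -2 };

/* Largest signed 32-bit value in the table, with 0 as the starting value. */
int32_t table_max(const int32_t *table, uint16_t table_length)
{
    int32_t acc = 0;                         /* accumulator starts at 0   */
    for (uint16_t i = 0; i < table_length; i++) {
        if (acc < table[i])                  /* same test as MAXL         */
            acc = table[i];
    }
    return acc;
}
```

If negative maxima matter, the starting value would need to be the most negative 32-bit number, or the first table entry, instead of zero.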
Read-Modify-Write Instructions
- Work directly on memory, bypassing ACC
- Atomic operations, protected from interrupts

Read-Modify-Write Examples

                  update with a mem     update with a constant    update by 1
                  VarA += VarB          VarA += 100               VarA += 1

Load/store:       SETC  INTM            SETC  INTM                SETC  INTM
                  MOV   AL,@VarB        MOV   AL,@VarA            MOV   AL,@VarA
                  ADD   AL,@VarA        ADD   AL,#100             ADD   AL,#1
                  MOV   @VarA,AL        MOV   @VarA,AL            MOV   @VarA,AL
                  CLRC  INTM            CLRC  INTM                CLRC  INTM

Read-modify-      MOV   AL,@VarB        ADD   @VarA,#100          INC   @VarA
write:            ADD   @VarA,AL
Objective
The objective of this lab is to practice and verify the mechanics of performing assembly language programming arithmetic on the TMS320C28x. In this exercise, we will expand upon the .asm file from the previous lab to include new functions. Code will be added to obtain the sum of the products of the values from each array. Perform the sum of products using a MAC-based implementation. In a real system application, the coeff array may well be constant (values do not change), so the initialization routine can be modified to skip the transfer of this array, reducing the amount of data RAM and cycles required for initialization. Also, there is no need to copy the zero to clear the result location. The initialization routine from the previous lab using the load/store operation will be replaced with a looped BANZ implementation. As in the previous lab, consider which addressing modes are optimal for the tasks to be performed. You may perform the lab based on this information alone, or may refer to the following procedure.
Procedure
Optional Exercise
After completing the above, edit LabC.asm and modify it to perform the initialization process using RPT/PREAD rather than a load/store/BANZ.
End of Exercise
Objective
The objective of this lab is to practice and verify the mechanics of performing C programming arithmetic on the TMS320C28x. The objective will be to add the code necessary to obtain the sum of the products of the n-th values from each array.
Procedure
Appendix D C Programming
Introduction
The C28x architecture, hardware, and compiler have been designed to efficiently support C code programming. Appendix D will focus on how to program in C for an embedded system. Issues related to programming in C and how C behaves in the C28x environment will be discussed. Also, the C compiler optimization features will be explained.
Learning Objectives
- Learn the basic C environment for the C28x family
- How to control the C environment
- How to use the C-compiler optimizer
- Discuss the importance of volatile
- Explain optimization tips
Module Topics
Appendix D C Programming
  Module Topics
  Linking Boot code from RTS2800.lib
  Set up the Stack
  C28x Data Types
  Accessing Interrupts / Status Register
  Using Embedded Assembly
  Using Pragma
  Optimization Levels
    Volatile Usage
    Compiler Advanced Options
    Optimization Tips Summary
  Lab D: C Optimization
  OPTIONAL Lab D2: C Callable Assembly
  Solutions
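Linking Boot code from RTS2800.lib
[Figure: boot code from the run-time library, ending with the call to _main]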
The boot routine is used to establish the environment for C before launching main. The boot routine begins with the label _c_int00, and the reset vector should contain a ".long" to this address to make boot.asm the reset routine. The contents of the boot routine have been extracted and copied on the following page so they may be inspected. Note the various functions performed by the boot routine, including the allocation and setup of the stack, the setting of various C-requisite statuses, the initialization of global and static variables, and the call to main. Note that if the link was performed using the "-cr" option instead of the "-c" option, the global/static variable initialization is not performed. This is useful on RAM-based C28x systems that were initialized during reset by some external host processor, making transfer of initialization values unnecessary. Later on in this chapter, there is an example of how to do the vectors in C code rather than assembly.
The Stack
The C/C++ compiler uses a stack to:
- Allocate local variables
- Pass arguments to functions
- Save the processor status
- Save the function return address
- Save temporary results
The compiler uses the hardware stack pointer (SP) to manage the stack. SP defaults to 0x400 at reset. The run-time stack grows from low addresses to higher addresses.
[Figure: the .stack section within the lower 64K of the 4M data space, with SP at 0x400 after reset. Stack contents: caller's local vars, arguments passed on stack, function return addr, temp results.]
The C28x has a 16-bit stack pointer (SP), allowing access to the base 64K of memory. The stack grows from low to high memory, and SP always points to the first unused location. The compiler uses the hardware stack pointer (SP) to manage the stack.
The .stack section has to be linked into the low 64K of data memory, because the 16-bit SP cannot access addresses beyond 64K. The stack size is set by the linker. The linker creates a global symbol, __STACK_SIZE, and assigns it a value equal to the size of the stack in bytes (default 1K words). You can change the stack size at link time by using the -stack linker command option.
Note: The compiler provides no means to check for stack overflow during compilation or at runtime. A stack overflow disrupts the run-time environment, causing your program to fail. Be sure to allow enough space for the stack to grow.
In order to allocate the stack the linker command file needs to have align = 2.
Type           Bits   Representation
float          32     IEEE single precision
double         32     IEEE single precision
long double    64     IEEE double precision
Suggestion: Group all longs together, group all pointers together
Data which is 32-bits wide, such as longs, must begin on even word-addresses (i.e. 0x0, 0x2, etc). This can result in holes in structures allocated on the stack.
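The effect of the grouping suggestion can be seen with two structure layouts. The sketch below is meant to compile on any C11 host as an illustration; the structure and function names are invented. On the C28x, sizeof counts 16-bit words rather than 8-bit bytes, so the numbers differ from a PC, but the principle that interleaving 16-bit and 32-bit members can create holes is the same.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit and 32-bit members interleaved: each 32-bit member may need a hole
   in front of it so that it starts on an aligned address. */
struct mixed   { int16_t a; int32_t b; int16_t c; int32_t d; };

/* Same members with the 32-bit ones grouped together first. */
struct grouped { int32_t b; int32_t d; int16_t a; int16_t c; };

/* Storage actually needed by the four members, ignoring any holes. */
static size_t payload(void)
{
    return 2 * sizeof(int16_t) + 2 * sizeof(int32_t);
}

size_t padding_in_mixed(void)   { return sizeof(struct mixed)   - payload(); }
size_t padding_in_grouped(void) { return sizeof(struct grouped) - payload(); }
```

Comparing the two padding figures on the target, for example in a watch window, shows how much space a given ordering costs.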
- Interrupt Enable & Interrupt Flag Registers (IER, IFR) are not memory mapped
- Only limited instructions can access IER & IFR (more in interrupt chapter)
- The compiler provides extern variables for accessing the IER & IFR
Embedding Assembly in C
- Allows direct access to assembly language from C
- Useful for operating on components not used by C, ex:
      asm(" CLRC INTM");
  Note: the first column after the leading quote is the label field; if there is no label, it should be a blank space.
- Avoid modifying registers used by C
- Lengthy code should be written in ASM and called from C
  - main C file retains portability
  - yields more easily maintained structures
  - eliminates risk of interfering with registers in use by C
The assembly function allows C files to contain 28x assembly code. Care should be taken not to modify registers in use by C, and to consider the label field with the assembly function. Also, any significant amount of assembly code should be written in an assembly file and called from C. There are two examples in this slide. The first one shows how to embed a single assembly language instruction into the C code flow. The second example shows how to define a C term that will invoke the assembly language instruction.
Using Pragma
Pragma is a preprocessor directive that provides directions to the compiler about how to treat a particular statement. The following example shows how the DATA_SECTION pragma is used to put a specific buffer into a different section of RAM than other buffers. The example shows two buffers, bufferA and bufferB. The first buffer, bufferA is treated normally by the C compiler by placing the buffer (512 words) into the ".bss" section. The second, bufferB is specifically directed to go into the my_sect portion of data memory. Global variables, normally ".bss", can be redirected as desired. When using CODE_SECTION, code that is normally linked as ".text", can be identified otherwise by using the code section pragma (like .sect in assembly).
Pragma Examples
User defined sections from C:
    #pragma CODE_SECTION (func, "section name")
    #pragma DATA_SECTION (symbol, "section name")
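The two-buffer case described in the text might look like the sketch below. The guard macro `__TMS320C2000__` is assumed here to be predefined by the TI compiler; it is used only so that the same file also compiles on a host, where the pragma is skipped. The element type is a guess: char is 16 bits wide on the C28x, so 512 elements occupy 512 words there.

```c
/* Place bufferB in the user-defined section "my_sect"; bufferA is left to
   the compiler's default placement (.bss). The pragma must appear before
   the definition of the symbol it names. */
#ifdef __TMS320C2000__
#pragma DATA_SECTION(bufferB, "my_sect")
#endif

char bufferA[512];   /* default section                              */
char bufferB[512];   /* redirected to my_sect when built for the C28x */
```

The section name my_sect then has to be assigned to a memory range in the linker command file, in the same way as .bss or .text.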
Optimization Levels
Optimization Scope
[Figure: FILE1.C and FILE2.C, each containing functions made up of SESE (Single Entry, Single Exit) blocks]

Scope       Extent              Options
LOCAL       single block        -o0, -o1
FUNCTION    across blocks       -o2
FILE        across functions    -o3
PROGRAM     across files        -pm -o3
Optimizations fall into 4 categories. This is also a methodology that should be used to invoke the optimizations. It is recommended that optimization be invoked in steps, and that code be verified before advancing to the next step. Intermediate steps offer a gradual transition from fully symbolic to fully optimized compilation. Compiler switches may be invoked in a variety of ways. Here are 4 steps that could be considered:
1st: use -g. By starting out with -g, you do no optimization at all and keep symbols for debug.
2nd: use -g -o3. The option -o3 might be too big a jump, but it adds the optimizer and keeps symbols.
3rd: use -g -o3 -mn. This is a full optimization, but keeps some symbols.
4th: use -o3. Full optimization, symbols are not kept.
Optimization Performance

-o0 (LOCAL)
- Performs control-flow-graph simplification
- Allocates variables to registers
- Performs loop rotation
- Eliminates unused code
- Simplifies expressions and statements
- Expands calls to functions declared inline
-o1 (LOCAL)
- Performs local copy/constant propagation
- Removes unused assignments
- Eliminates local common expressions
-o2 (FUNCTION), the default (-o)
- Performs loop optimizations
- Eliminates global common sub-expressions
- Eliminates global unused assignments
-o3 (FILE)
- Removes all functions that are never called
- Simplifies functions with return values that are never used
- Inlines calls to small functions
- Identifies file-level variable characteristics
-o3 -pm (PROGRAM)
Optimizer levels zero through three offer an increasing array of actions, as seen above. Higher levels include all the functions of the lower ones. Increasing the optimizer level also increases the scope of optimization, from single-entry, single-exit blocks only, through all the elements in a file. The -pm option directs the optimizer to view multiple input files as one large file, so that optimization can be performed across the whole system.
Volatile Usage
When using optimization, it is important to declare variables as volatile when:
- The memory location may be modified by something other than the compiled code (e.g. it is a memory-mapped peripheral register).
- The order of operations should not be rearranged by the compiler.
Define the pointer as volatile to prevent the optimizer from removing or reordering accesses made through it.
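The volatile rule above can be sketched as a simple polling loop. This is an illustration under stated assumptions: fake_status_reg stands in for a memory-mapped peripheral register so that the code can run on any machine, and the names STATUS, READY_BIT and wait_ready are hypothetical. On a real device the pointer would be set to the register's address instead.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped status register (assumption for
   illustration; real hardware would use a fixed peripheral address). */
uint16_t fake_status_reg = 0;

/* Pointer to a volatile object: every read through STATUS must really
   be performed, in program order, even at high optimization levels. */
volatile uint16_t * const STATUS = &fake_status_reg;

#define READY_BIT 0x0001u

/* Poll the register until READY_BIT is set or max_polls reads have been
   made. Returns the number of reads performed. Without volatile, an
   optimizer could read the register once and reuse the stale value. */
unsigned wait_ready(unsigned max_polls)
{
    unsigned polls = 0;

    while (polls < max_polls) {
        polls++;
        if (*STATUS & READY_BIT) {
            break;
        }
    }
    return polls;
}
```

The bound on the number of polls is a design choice for the sketch; it keeps the loop from hanging if the flag is never set.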
Auto inlining threshold (-oi with a size value):
- Used with -o3 option
- Functions > size are not auto inlined
Note: To prevent code size increases when using -o3, disable auto inlining with -oi0.

The next point we will cover is the Normal Optimization with Debug (-mn):
- Re-enables optimizations disabled by -g option (symbolic debug)
- Used for maximum optimization
Note: Some symbolic debug labels will be lost when the -mn option is used.
Optimizer should be invoked incrementally:
    -g              test    Symbols kept for debug
    -g -o3          test    Add optimizer, keep symbols
    -g -o3 -mn      test    More optimize, some symbols
    -o3             test    Final rev: Full optimize, no symbols
[-mf] : Optimize for speed instead of the default optimization for code size
[-mi] : Avoid RPT instruction. Prevents the compiler from generating the RPT instruction, which is not interruptible
[-mt] : Unified memory model. Use this switch with the unified memory map of the 281x & 280x. Allows the compiler to generate the following:
    - RPT PREAD for memory copy routines or structure assignments
    - MAC instructions
    - Improved efficiency of switch tables
- Tune memory map via linker command file
- Re-write key code segments to use intrinsics or in assembly
App notes 3rd Parties
The list above documents the steps that can be taken to achieve increasingly higher coding efficiency. It is recommended that users first get their code to work with no optimization, and then add optimizations until the required performance is obtained.
Lab D: C Optimization
Note: The lab linker command file is based on the F28335 memory map. Modify as needed if using a different F28xx device memory map.
Objective
The objective of this lab is to practice and verify the mechanics of optimizing C programs. Using Code Composer Studio profile capabilities, different routines in a project will be benchmarked. This will allow you to analyze the performance of different functions. This lab will highlight the profiler and the clock tools in CCS.
Procedure
5. Set a breakpoint on the NOP in the while(1) loop at the end of main() in LabD.c.
6. Set up the profile session by selecting Profiler -> Start New Session. Enter a session name of your choice (e.g. LabD).
7. In the profiler window, hover the mouse over the icons on the left region of the window and select the icon for Profile All Functions. Click on the + to expand the functions. Record the Code Size of the function sop C code in the table at the end of this lab.
Note: If you do not see a + beside the .out file, press Profile All Functions on the horizontal tool bar. (You can close the build window to make the profiler window easier to view by right-clicking on the build window and selecting Hide.)
8. Select F5 or the run icon. Observe the values present in the profiling window. What do the numbers mean? Click on each tab to determine what each displays.
Benchmarking Code
9. Let's benchmark (i.e., count the cycles needed by) only a portion of the code. This requires you to set a breakpoint pair on the starting and ending points of the benchmark. Open the file sop-c.c and set a breakpoint on the for statement and the return statement.
10. In CCS, select Profile -> Setup. Check Profile all Functions and Loops for Total Cycles and click Enable Profiling. Then select Profile -> Viewer.
11. Now Restart the program and then Run the program. The program should be stopped at the first breakpoint in sop. Double-click on the clock window to set the clock to zero. Now you are ready to benchmark the code. Run to the second breakpoint. The number of cycles is displayed in the viewer window. Record this value in the table at the end of the lab under C Code - Cycles.
C Optimization
12. To optimize C code to the highest level, we must set up new Build Options for our Project. Select the Compiler tab. In the Basic Category Panel, under Opt Level select File (-o3). Then select OK to save the Build Options.
13. Now Rebuild the program and then Run the program. The program should be stopped at the first breakpoint in sop. Double-click on the clock window to set the clock to zero. Now you are ready to benchmark the code. Run to the second breakpoint. The number of cycles is displayed in the clock window. Record this value in the table at the end of the lab under Optimized C (-o3) - Cycles.
14. Look in your profile window at the code size of sop. Record this value in the table at the end of this lab.
16. Start a new profile session and set it to profile all functions. Run to the first breakpoint and study the profiler window. Record the code size of the assembly code in the table.
17. Double-click on the clock to reset it. Run to the last breakpoint. Record the number of cycles the assembly code ran.
18. How do assembly, C code, and optimized C code compare on the C28x?

                             Code Size     Cycles
    C Code                   _________     _________
    Optimized C Code (-o3)   _________     _________
    Assembly Code            _________     _________
End of Exercise
Objective
The objective of this lab is to practice and verify the mechanics of implementing C-callable assembly routines. In this lab, the main routine in a C file will call the sum-of-products routine (from the previous Appendix LabC exercise). Additionally, we will learn how to use Code Composer Studio to configure the C build options and add the run-time support library to the project. As in previous labs, you may perform the lab based on this information alone, or may refer to the following procedure.
Procedure
12. Next, we need to add code to initialize the sum-of-products parameters properly, based on the passed parameters. Add the following code to the first few lines after entering the _sop routine. (Note that the two pointers are passed in AR4 and AR5, but one needs to be placed in AR7. The loop counter is the third argument, and it is passed in the accumulator.)

        MOVL    XAR7,XAR5       ;XAR7 points to coeff [0]
        MOV     AR5,AL          ;move n from ACC to AR5 (loop counter)
        SUBB    XAR5,#1         ;subtract 1 to make loop counter = n-1

Before beginning the MAC loop, add statements to set the sign extension mode, set the SPM to zero, and a ZAPA instruction. Use the same MAC statement as in Lab 4, but use XAR4 in place of XAR2. Make the repeat statement use the passed value of n-1 (i.e. AR5).

        RPT     AR5             ;repeat next instruction AR5+1 times
Now we need to return the result. To return a value to the calling routine you will need to place your 32-bit value in the ACC. What register is the result currently in? Adjust your code, if necessary.
13. Save the assembly file as sop-asm.asm. (Do not name it LabD2.asm because the compiler has already created a file with that name from the original LabD2.c code.)
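As a reference for what the assembly routine has to compute, the sum-of-products call can be modeled in plain C. This is a sketch under assumptions: the parameter names and the exact prototype are hypothetical, chosen to match the argument passing described in the procedure (two pointers, then the count, with a 32-bit result).

```c
/* C model of the C-callable sum-of-products routine.
   Per the procedure text, on the C28x:
     data  -> first pointer argument  (arrives in XAR4)
     coeff -> second pointer argument (arrives in XAR5, moved to XAR7)
     n     -> third argument          (arrives in the accumulator, AL)
   and the 32-bit result is returned in ACC. */
long sop(const int *data, const int *coeff, int n)
{
    long acc = 0;   /* plays the role of the accumulator cleared by ZAPA */
    int i;

    /* n multiply-accumulates, matching RPT with a count of n-1 */
    for (i = 0; i < n; i++) {
        acc += (long)data[i] * coeff[i];
    }
    return acc;
}
```

Running the same input arrays through this model and through the assembly version gives a quick check that the register setup and the return value in ACC are correct.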
End of Exercise
Solutions
Lab D Solutions
Learning Objectives
- Architecture: floating-point format, registers, and pipeline
- Instructions: instruction types, delay slots, parallel instructions, RPTB, floating-point flags
- Instruction Summary
Module Topics
Appendix E  Floating-Point Unit
    Module Topics
    FPU Format, Registers, and Pipeline
    Instruction Types and Formats
    Interrupts and Code Comparisons
    Instruction Summary
IEEE single-precision floating-point format:

S        E            M             Value
0        0            0             Zero
1        0            0             Negative zero
0 or 1   0            Non-zero      Denormalized
0 or 1   1-254        0-0x7FFFFF    Positive or negative values*
0 or 1   255 (max)    0             Positive or negative infinity
0 or 1   255 (max)    Non-zero      Not a Number (NaN)

* Normal positive and negative values are calculated as $(-1)^{S} \times 2^{(E-127)} \times 1.M$, giving a range of about $\pm 1.2 \times 10^{-38}$ to $\pm 3.4 \times 10^{+38}$.

The normalized IEEE numbers have a hidden 1; thus the equivalent signed integer resolution is the number of mantissa bits + sign + 1.

- Denormalized values are treated as zero
- Not-a-number (NaN) is treated as infinity
- Round-to-Zero mode supported (truncate)
- Round-to-Nearest mode supported (even)
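The S, E and M fields in the table above can be pulled out of a float with a few shifts and masks. The sketch below is an illustration, not part of the workshop material: the type and function names (f32_fields, f32_split, f32_classify, f32_normal_value) are hypothetical, and the classification follows the standard IEEE categories rather than the FPU's treatment of denormals and NaNs.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned s;    /* sign, 1 bit       */
    unsigned e;    /* exponent, 8 bits  */
    uint32_t m;    /* mantissa, 23 bits */
} f32_fields;

typedef enum { F32_ZERO, F32_DENORMAL, F32_NORMAL, F32_INFINITY, F32_NAN } f32_class;

/* Split a 32-bit float into its S, E and M fields. */
f32_fields f32_split(float x)
{
    uint32_t bits;
    f32_fields f;

    memcpy(&bits, &x, sizeof bits);   /* reinterpret the bits safely */
    f.s = (unsigned)((bits >> 31) & 1u);
    f.e = (unsigned)((bits >> 23) & 0xFFu);
    f.m = bits & 0x7FFFFFu;
    return f;
}

/* Classify according to the rows of the format table. */
f32_class f32_classify(f32_fields f)
{
    if (f.e == 0)   return (f.m == 0) ? F32_ZERO : F32_DENORMAL;
    if (f.e == 255) return (f.m == 0) ? F32_INFINITY : F32_NAN;
    return F32_NORMAL;
}

/* Rebuild a normal value as (-1)^S * 2^(E-127) * 1.M. */
double f32_normal_value(f32_fields f)
{
    double v = 1.0 + (double)f.m / 8388608.0;   /* 1.M, with 2^23 = 8388608 */
    int exp = (int)f.e - 127;

    while (exp > 0) { v *= 2.0; exp--; }
    while (exp < 0) { v /= 2.0; exp++; }
    return f.s ? -v : v;
}
```

For example, -2.5 splits into S = 1, E = 128 and M = 0x200000, since 2.5 = 1.25 x 2^1 and 0.25 x 2^23 = 0x200000.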
[Figure: CPU register set. 32-bit: Accumulator, Product, Temporary, 8 Auxiliary registers. 22-bit: Program Counter, Return PC. 16-bit: Data Page Pointer, Stack Pointer, 2 Status registers, Interrupt Enable, Interrupt Flag.]
R0H through R7H and STF are shadowed for fast context save and restore
[Figure: FPU pipeline. C28x pipeline stages: F1 F2 D1 D2 R1 R2 E W. FPU instruction stages: D R E1 E2/W. Load, Store and CMP/MIN/MAX/NEG/ABS: no delay slot instruction required. MPY/ADD/SUB/MACF32: 1 delay slot instruction required.]

- Floating-point operations are not pipeline protected
- Some instructions require delay slots for the operation to complete
- Insert NOPs or other non-conflicting instructions between operations
- Assembler issues errors for pipeline conflicts
- Fill delay slots with non-conflicting instructions to improve performance
- 3 general guidelines:

Type               Instructions                              Cycles          Delay slots
Math                                                         2p cycles       One delay slot
Conversion         I16TOF32, F32TOI16, F32TOI16R, etc.       2p cycles       One delay slot
Everything Else*   Load, Store, Compare, Min, Max,           Single cycle    No delay slot
                   Absolute and Negative value

* Note: MOV32 between FPU and CPU registers is a special case
[Figure: instruction formats. Fixed-point MPY compared with floating-point MPYF32, with the destination operand labeled. Instructions shown by type: MOV32, MOVD32, CMPF32, ABSF32, SAVE, UI16TOF32, I32TOF32, F32TOI16R, F32TOI32.]
* Moves between CPU and FPU registers require additional pipeline alignment
MPYF32 || MOV32
Math Operations
        MPYF32  R2H, R1H, R0H   ; 2p instruction
        NOP                     ; 1 cycle delay
                                ; <- MPYF32 completes, R2H valid
        <any instruction>       ; Can use R2H

        MPYF32  R2H, R1H, R0H   ; 2p instruction
        ADDF32  R3H, R3H, R1H   ; 1 cycle delay for MPYF32
                                ; <- MPYF32 completes, R2H valid
        NOP                     ; 1 cycle delay for ADDF32
        MOV32   *XAR7, R2H      ; Can use R2H
                                ; <- ADDF32 complete, R3H valid
                                ; <- MOV32 complete
        NOP
        <any instruction>

Math Operations: 2p (2 pipelined) cycles; can be launched every cycle; result is valid 2 instructions later
Move Operation: 1 cycle
Parallel Instructions
- Single instruction, single opcode
- Performs 2 operations (example: Add + Parallel Store)
- Parallel bars (||) indicate a parallel instruction

Parallel instruction types:
- Multiply & Parallel Add/Subtract
- Multiply, Add, Subtract, Mac & Parallel Load/Store
- Min or Max & Parallel Move

; After: R2H = R1H * R0H = 3.0 * 2.0
;        *XAR3 = 10.0
Parallel Instruction: MOV32 used the value of R2H before the MPYF32 update
Floating-point flags:
Latched overflow & underflow        Math: MPYF32, ADDF32, SUBF32, 1/X; connected to the PIE for debug
Negative & Zero (Float, Integer)    Move operations on registers; result of compare, min/max, absolute value and negative value
Test flag                           TESTTF instruction
Rounding mode                       To Zero (truncate) or To Nearest (even)
Shadow status                       For fast interrupt context save/restore

Loop:
        MOV32   R0H,*XAR4++
        MOV32   R1H,*XAR3++
        CMPF32  R1H, R0H
        MOVST0  ZF, NF          ; Use MOVST0 to copy FPU flags to ST0
        BF      Loop, GT        ; Loop if (R1H > R0H)
Repeat Block
- Improves performance of block algorithms (FFT, IIR)
- 0 cycles after the first iteration

        RPTB #label, #RC
        RPTB #label, loc16

Repeat Block Register RB (32 bits):
Bit(s)   31     30    29:23       22:16    15:0
Field    RAS    RA    RSIZE(7)    RE(7)    RC(16)

RA = 1: RPTB active; RA = 0: RPTB not active
RC: Repeat count. Block executes RC+1 times
RSIZE: RPTB block size. Max: 127 x 16-bit words. Min: 8 words (odd aligned), 9 words (even aligned)
RE: End address

; find the largest element and
; put its address in XAR6
        .align 2
        NOP
        RPTB    VECTOR_MAX_END, AR7
        MOVL    ACC,XAR0
        MOV32   R1H,*XAR0++
        MAXF32  R0H,R1H
        MOVST0  NF,ZF
        MOVL    XAR6,ACC,LT
VECTOR_MAX_END:
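For comparison, the repeat-block example can be expressed in C. This is a rough equivalent under assumptions: the function name vector_max is hypothetical, and the running maximum is seeded with the first element, whereas the assembly listing does not show how R0H and XAR6 are initialized.

```c
#include <stddef.h>

/* Return the address of the largest element of v[0..n-1] (n >= 1).
   Mirrors the assembly loop: each element is compared against the
   running maximum, and its address is kept only when the running
   maximum is smaller (the LT condition on MOVL XAR6,ACC). */
const float *vector_max(const float *v, size_t n)
{
    const float *best = v;    /* plays the role of XAR6 */
    size_t i;

    for (i = 1; i < n; i++) {
        if (*best < v[i]) {   /* running max < new element */
            best = &v[i];
        }
    }
    return best;
}
```

Because the comparison is strict, the first occurrence is returned when the largest value appears more than once.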
FPU Register to CPU Register:
        MOV32   @XARn,RaH
        MOV32   @ACC,RaH
        MOV32   @XT,RaH
        MOV32   @P,RaH
1 Cycle FPU Operation: Requires 1 instruction delay
R0H, @ACC
R2H,R1H,R0H
Do not use these in the delay slots: FRACF32, UI16TOF32, I16TOF32, F32TOUI32, F32TOI32
Note: The following critical registers are automatically saved on an interrupt: ACC, P, XT, ST0, ST1, IER, DP, AR0, AR1, PC
[Figure: interrupt context save and restore listings. Recoverable operands: R0H,*--SP; STF,*--SP; XT; XAR7; XAR6; XAR5; XAR4; XAR3; XAR2; AR1H:AR0H; DP,#PIE; INTM; RB.]

Context save:     42 cycles (worst case if need to save all registers); interrupt disabled for 21 cycles
Context restore:  32 cycles (worst case if need to restore all registers); interrupt disabled for 14 cycles
- Only uses SAVE & RESTORE in high priority interrupts
- Assumes interrupts are low priority by default
Compatible with existing C28x code
71 Words 72 Cycles
29 Words 33 Cycles
66 Words 70 Cycles
; Floating-Point Square Root
__sqrt:
        MOV32     *SP++,R4H
        CMPF32    R0H,#0.0
        MOVST0    ZF,NF
        B         L1,EQ
        EISQRTF32 R1H,R0H
        MOVIZF32  R2H,#0.5
        MPYF32    R2H,R0H,R2H
        MOVIZF32  R3H,#1.5
        MPYF32    R4H,R1H,R2H
        NOP
L1:
        MPYF32    R4H,R1H,R4H
        NOP
        SUBF32    R4H,R3H,R4H
        NOP
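The constants 0.5 and 1.5 in the listing above suggest a Newton-Raphson refinement of an inverse square root estimate. The C sketch below illustrates that technique under stated assumptions: the function names are hypothetical, the starting estimate uses a well-known bit trick as a stand-in for the EISQRTF32 instruction, and two refinement steps are assumed. It handles x = 0 and positive x only.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for EISQRTF32: a rough estimate of 1/sqrt(x) for x > 0,
   obtained with a well-known integer bit trick (assumption; the
   hardware instruction computes its estimate differently). */
static float isqrt_estimate(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);
    bits = 0x5F3759DFu - (bits >> 1);
    memcpy(&x, &bits, sizeof bits);
    return x;
}

/* Square root by Newton-Raphson on y ~ 1/sqrt(x):
       y = y * (1.5 - 0.5 * x * y * y)
   followed by sqrt(x) = x * y. */
float sqrt_newton(float x)
{
    float y, half_x;

    if (x == 0.0f) {
        return 0.0f;             /* early exit on zero, as in the listing */
    }
    y = isqrt_estimate(x);
    half_x = 0.5f * x;
    y = y * (1.5f - half_x * y * y);   /* first refinement  */
    y = y * (1.5f - half_x * y * y);   /* second refinement */
    return x * y;
}
```

Each refinement step roughly squares the relative error of the estimate, so the number of steps needed depends on how accurate the starting estimate is.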
42 Words 31 Cycles
; 14 cycles ; 32 bytes
Instruction Summary
Floating-Point Instructions
Instruction                                     Cycles
MOVIZ       RaH,#16F                            1
MOVXI       RaH,#16I                            1
MOV32       RaH,mem32{,CNDF}                    1
MOVD32      RaH,mem32                           1
MOV32       mem32,RaH                           1
MOV16       mem16,RaH                           1
MOV32       CPUReg,FPUReg                       1*
MOV32       FPUReg,CPUReg                       1*
ZEROA                                           1
ZERO                                            1
TESTTF                                          1
SWAPF                                           1
MOV32                                           1
MOV32       STF,mem32                           1
MOV32       mem32,STF                           1
MOVST0      FLAG                                1
SETFLG      FLAG,VALUE                          1
SAVE        FLAG,VALUE                          1
RESTORE                                         1
PUSH        RB                                  1
POP         RB                                  1
RPTB        #Label,#count                       1
RPTB        #Label,loc16                        1

* Note: Move between CPU and FPU registers requires special pipeline alignment

Instruction                                     Cycles
I16TOF32    RaH,mem16                           2p
I16TOF32    RaH,RbH                             2p
UI16TOF32   RaH,mem16                           2p
UI16TOF32   RaH,RbH                             2p
F32TOI16    RaH,RbH                             2p
F32TOI16R   RaH,RbH                             2p
F32TOUI16   RaH,RbH                             2p
F32TOUI16R  RaH,RbH                             2p
I32TOF32    RaH,mem32                           2p
I32TOF32    RaH,RbH                             2p
UI32TOF32   RaH,mem32                           2p
UI32TOF32   RaH,RbH                             2p
F32TOI32    RaH,RbH                             2p
F32TOUI32   RaH,RbH                             2p
CMPF32      RaH,RbH                             1
CMPF32      RaH,#16F                            1
CMPF32      RaH,#0.0                            1
MAXF32      RaH,RbH                             1
MAXF32      RaH,#16F                            1
MINF32      RaH,RbH                             1
MINF32      RaH,#16F                            1
ABSF32      RaH,RbH                             1
NEGF32      RaH,RbH{,CNDF}                      1
MAXF32      RaH,RbH || MOV32 RcH,RdH            1/1
MINF32      RaH,RbH || MOV32 RcH,RdH            1/1
Instruction                                               Cycles
EISQRTF32   RaH,RbH                                       2p
EINVF32     RaH,RbH                                       2p
FRACF32     RaH,RbH                                       2p
MPYF32      RaH,RbH,RcH                                   2p
ADDF32      RaH,RbH,RcH                                   2p
SUBF32      RaH,RbH,RcH                                   2p
MPYF32      RaH,RbH,#16F                                  2p
MPYF32      RaH,#16F,RbH                                  2p
ADDF32      RaH,RbH,#16F                                  2p
ADDF32      RaH,#16F,RbH                                  2p
SUBF32      RaH,#16F,RbH                                  2p
MPYF32      RaH,RbH,RcH || SUBF32 RdH,ReH,RfH             2p/2p
MPYF32      RaH,RbH,RcH || ADDF32 RdH,ReH,RfH             2p/2p
MPYF32      RdH,ReH,RfH || MOV32 RaH,mem32                2p/1
MPYF32      RdH,ReH,RfH || MOV32 mem32,RaH                2p/1
ADDF32      RdH,ReH,RfH || MOV32 RaH,mem32                2p/1
ADDF32      RdH,ReH,RfH || MOV32 mem32,RaH                2p/1
SUBF32      RdH,ReH,RfH || MOV32 RaH,mem32                2p/1
SUBF32      RdH,ReH,RfH || MOV32 mem32,RaH                2p/1
MACF32      R3H,R2H,RdH,ReH,RfH || MOV32 RaH,mem32        2p/1
MACF32      R7H,R6H,RdH,ReH,RfH || MOV32 RaH,mem32        2p/1
MACF32                                                    2p+N
Note: UNCF: This test in conditional operations can modify flags (based on destination register value). UNC, NEQ, EQ, GT, GEQ, LT, LEQ, TF, NTF, LU, LV: These tests in conditional operations do not modify flags.