An Embedded Software Primer


David E. Simon

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Pearson Education was aware of a trademark claim, the designations have been printed in initial caps or in all caps.

To A. J. Nichols

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

Copyright © 1999 by Pearson Education, Inc.

This edition is published by arrangement with Pearson Education, Inc.

This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, resold, hired out, or otherwise circulated without the publisher's prior written consent in any form of binding or cover other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser and without limiting the rights under copyright reserved above, no part of this publication may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording or otherwise), without the prior written permission of both the copyright owner and the above-mentioned publisher of this book.

ISBN 81-7808-045-1

First Indian Reprint, 2000; Second Indian Reprint, 2001; Third Indian Reprint, 2001; Fourth Indian Reprint, 2001; Fifth Indian Reprint, 2002; Sixth Indian Reprint, 2002; Seventh Indian Reprint, 2002; Eighth Indian Reprint, 2003; Ninth Indian Reprint, 2003; Tenth Indian Reprint, 2004; Eleventh Indian Reprint, 2004; Twelfth Indian Reprint, 2005

This edition is manufactured in India and is authorized for sale only in India, Bangladesh, Pakistan, Bhutan, Nepal, Sri Lanka and the Maldives.

Published by Pearson Education (Singapore) Pte. Ltd., Indian Branch, 482 F.I.E. Patparganj, Delhi 110 092, India

Printed in India by Tan Prints (I) Pvt. Ltd.

Contents

Preface
Acknowledgments
About This Book and the Accompanying CD-ROM

1 A First Look at Embedded Systems
1.1 Examples of Embedded Systems
1.2 Typical Hardware
Chapter Summary

2 Hardware Fundamentals for the Software Engineer
2.1 Terminology
2.2 Gates
2.3 A Few Other Basic Considerations
2.4 Timing Diagrams
2.5 Memory
Chapter Summary
Problems

3 Advanced Hardware Fundamentals
3.1 Microprocessors
3.2 Buses
3.3 Direct Memory Access
3.4 Interrupts
3.5 Other Common Parts
3.6 Built-Ins on the Microprocessor
3.7 Conventions Used on Schematics
3.8 A Sample Schematic
3.9 A Last Word about Hardware
Chapter Summary
Problems

4 Interrupts
4.1 Microprocessor Architecture
4.2 Interrupt Basics
4.3 The Shared-Data Problem
4.4 Interrupt Latency
Chapter Summary
Problems

5 Survey of Software Architectures
5.1 Round-Robin
5.2 Round-Robin with Interrupts
5.3 Function-Queue-Scheduling Architecture
5.4 Real-Time Operating System Architecture
5.5 Selecting an Architecture
Chapter Summary
Problems

6 Introduction to Real-Time Operating Systems
6.1 Tasks and Task States
6.2 Tasks and Data
6.3 Semaphores and Shared Data
Chapter Summary
Problems

7 More Operating System Services
7.1 Message Queues, Mailboxes, and Pipes
7.2 Timer Functions
7.3 Events
7.4 Memory Management
7.5 Interrupt Routines in an RTOS Environment
Chapter Summary
Problems

8 Basic Design Using a Real-Time Operating System
8.1 Overview
8.2 Principles
8.3 An Example
8.4 Encapsulating Semaphores and Queues
8.5 Hard Real-Time Scheduling Considerations
8.6 Saving Memory Space
8.7 Saving Power
Chapter Summary
Problems

9 Embedded Software Development Tools
9.1 Host and Target Machines
9.2 Linker/Locators for Embedded Software
9.3 Getting Embedded Software into the Target System
Chapter Summary

10 Debugging Techniques
10.1 Testing on Your Host Machine
10.2 Instruction Set Simulators
10.3 The assert Macro
10.4 Using Laboratory Tools
Chapter Summary
Problems

11 An Example System
11.1 What the Program Does
11.2 Environment in Which the Program Operates
11.3 A Guide to the Source Code
11.4 Source Code
Summary
Problems

Afterword
Further Reading
Index

Preface


This book is to help you learn the basic principles of writing software for embedded systems. It surveys the issues and discusses the various techniques for dealing with them. In particular, it discusses approaches to the appropriate use of the real-time operating systems upon which much embedded software is based. In addition to explaining what these systems do, this book points out how you can use them most effectively.

You need know nothing about embedded-systems software and its problems to read this book; we'll discuss everything from the very beginning. You should be familiar with basic computer programming concepts: you might be a software engineer with a year or more of experience, or perhaps a student with a few programming courses under your belt. You should understand the problems involved in writing application programs. This book requires a reading knowledge of the C programming language; since C is the lingua franca of embedded systems, you will have to learn it sooner or later if you hope to get into the field. A little knowledge of assembly language will also be helpful.

You have no doubt seen many books about software that are 800 or 900 or even 1000 pages long. Presumably you have noticed by now that this book is much smaller than that. This is intentional: the idea is that you might actually want to read all the way through it. This book is not entitled Everything There Is to Know about Embedded Systems Software. Nobody could write that book, and if someone could and did, you wouldn't want to read it anyway. This book is more like What You Need to Know to Get Started in Embedded Systems Software, telling you enough that you'll understand the issues you will face and getting you started on finding the information about your particular system so that you can resolve those issues.

This book is not specific to any microprocessor or real-time operating system, nor is it oriented towards any particular software design methodology. The principles are the same, regardless of which microprocessor and which real-time operating system and which software design methodology you use. We will concentrate on the principles, which you can apply to almost any embedded system project. When you need to know the specifics of your microprocessor and your real-time operating system, look in the voluminous manuals that hardware and software vendors provide with their products. This book will help you know what information to look for.

This book is not academic or theoretical; it offers engineering information and engineering advice.

In short, this book is the cornerstone of the knowledge that you'll need for writing embedded-systems software.

David E. Simon

Acknowledgments

No one has enough good ideas for a book such as this or the perseverance to see it through without help from other people. So here, more or less in chronological order, is the story of this book and the people who helped me turn it into reality.

First, thanks are due to the people at Probitas Corporation: to A. J. Nichols, who has made the company the thoughtful, high-quality software environment that it is; to Michael Grischy for the ongoing debates on embedded-system design and coding style; to Richard Steinberg, who checked the code examples in this book; and to Ric Vilbig, who reviewed the two chapters on hardware and corrected a number of my misconceptions.

My wife, Lynn Gordon, encouraged me to write this book, predicting (correctly, as it turned out) that I would enjoy doing it. Thank you for getting me started, and thanks for the editing help ... even if you are always right about the fine points of English usage.

Thank you to a half-dozen classes full of students: to those of you who asked the advanced questions and forced me to clarify my thinking, to those of you who asked the elementary questions and forced me to clarify my explanations, and to all of you who suffered in silence with early versions of the manuscript.

Thanks to the smart people at Dragon Systems, Inc., who wrote NaturallySpeaking, a voice recognition program good enough to allow me to prepare a manuscript while injuries prevented me from typing much.

A huge thanks to Jean Labrosse for giving permission to include his real-time operating system, µC/OS, as part of this book. You have done the world a real favor in writing this system and in allowing it to be used freely for educational purposes.

A thank you to John Keenan, who taught me a lot of what I know about hardware and who straightened out a few blunders that made it into the manuscript.

The following people reviewed the first version of the manuscript and provided reams of good suggestions, many of which I incorporated into the book: Antonio Bigazzi, Fred Clegg, David Cub, Michael Eager, John Kwan, Tom Lyons, and Steve Vinoski. Thank you all.

Thanks are due to Mike Hulme at Zilog, who gave permission to use the schematic example at the end of Chapter 3 and who ran down a legible copy of it.

Finally, thanks to Debbie Lafferty and Jacquelyn Doucette, who shepherded this book through its various stages; to Ann Knight and Regina Knox, who pointed out all of those things that had somehow made sense when I wrote them but didn't later when someone else tried to read them; and to Laurel Muller, who turned my scruffy sketches into legible figures.

About This Book and the Accompanying CD-ROM


The Perversities of Embedded Systems

One very unfortunate aspect of embedded systems is that the terminology surrounding them is not very consistent. For every concept, there are two or three different words. For every word, there are four or five subtly different meanings. You will just have to live with this problem. In this book we will point out the variations in meaning; then we will assign a specific meaning for each word so that we don't get confused. When you are reading other books or talking to people, however, you'll have to be aware that their words may mean something slightly different from what they mean in this book.

Another unfortunate problem is that the term embedded systems covers such a broad range of products that generalizations are difficult. Systems are built with microprocessors as diverse as the Z8, an 8-bit microprocessor that cannot even use external memory, and the PowerPC, a 32-bit microprocessor that can access gigabytes. The code size of the applications varies from under 500 bytes to millions of bytes.

Because of this and because of the wide variety of applications, embedded software is a field in which no wisdom is universal. Any rule followed by 85 percent of engineers as part of the accepted gospel of best practice has to be broken by the other 15 percent just to get their systems to work. This book will focus on the rules of the 85 percent, emphasizing the concepts and the reasoning behind the rules and helping you decide whether you should follow the common practice or if your project is part of the 15 percent.

Chapter Dependencies in This Book

Although this book is intended to be read straight through, and although every chapter depends at least a little upon the chapters that precede it, you can skip around if you like. Since this book starts every subject at the very beginning, you may be able to skip some sections if you already know some of the material.

The most important dependencies among the chapters are shown in the diagram here.

[Diagram: chapter dependencies. Boxes include "A First Look at Embedded Systems" and "10 Debugging Techniques".]

If you already know about hardware, for example, or if your work doesn't require that you know anything about it, you can skip Chapters 2 and 3. However, don't try to read Chapter 6 without reading Chapter 4 or knowing about the material in it.

C++

Although C++ is an increasingly popular language in the embedded-systems world, you will not see it in this book. This is not intended to discourage you from using C++, which is popular in the embedded-systems world for the same good reasons it is popular in the applications world. However, one of the acknowledged disadvantages of C++ is that it is a complicated language, in many ways much more difficult and subtle than C. The programming principles discussed in this book apply equally to C and to C++ (and to Ada, Java, BASIC, and any other language in which you might choose to program your embedded system, for that matter). Therefore, for the purposes of illustration in this book, it makes sense to steer clear of the complications of C++.

C!!

One of the problems of providing understandable examples in a book such as this (and in fact one of the problems of software in general) is that important points tend to get lost in a morass of detail. To prevent that from happening, some of the examples in this book will not be written entirely in C. Instead, they will be written in C!!.

C!! is identical to C, except that wherever you put two exclamation points, the computer does whatever is described after those exclamation points. For example:

   if (x != 0)
   {
      !! Read timer value from the hardware
      !! Do all the necessary ugly arithmetic
      y = !! Result of the ugly arithmetic
   }

   if (y > 197)
      !! Turn on the warning light

If x is not zero, then this program does whatever is necessary to read the value from the hardware timer. Then it does various necessary calculations and stores the result in y. If y is greater than 197, this program turns on the warning light.

Rest assured that we only use the special feature of C!! for such things as hardware-dependent code or specific calculations needed for some arcane application. The parts of the examples that pertain to the point being made are written in plain old vanilla C.¹

1. It would be nice if we could all write all of our problems in C!!, but unfortunately, the compiler for C!! is still under development.
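To make the idea concrete, here is one way the C!! fragment above might look once each !! line is replaced with real C. This is only a sketch: the stand-in variables for the hardware, the helper names (uiReadTimer, iUglyArithmetic, vWarningLightOn), and the arithmetic itself are invented for illustration. On a real system each of them would depend on the hardware.

```c
/* Stand-ins for hardware so the sketch compiles on a host machine.
   On a real target these would be memory-mapped registers. */
static unsigned int uiTimerRegister = 0;   /* hypothetical timer register */
static int fWarningLight = 0;              /* hypothetical warning light  */

/* "!! Read timer value from the hardware" */
static unsigned int uiReadTimer(void)
{
    return uiTimerRegister;
}

/* "!! Do all the necessary ugly arithmetic" (made-up scaling) */
static int iUglyArithmetic(unsigned int uiTimer)
{
    return (int)(uiTimer / 4u) + 1;
}

/* "!! Turn on the warning light" */
static void vWarningLightOn(void)
{
    fWarningLight = 1;
}

/* The C!! fragment from the text, written out in plain C.
   y is set to 0 here so that it has a defined value when x is zero. */
void vCheckTimer(int x)
{
    int y = 0;

    if (x != 0)
    {
        unsigned int uiTimer = uiReadTimer();
        y = iUglyArithmetic(uiTimer);
    }

    if (y > 197)
        vWarningLightOn();
}
```

Notice how much of this listing is detail that has nothing to do with the two if statements; that detail is what C!! leaves out.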


Hungarian Variable Naming

Many of the code examples in this book use a convention for naming variables called Hungarian. Here is a brief introduction to that convention. In Hungarian, the name of a variable contains information about its type. For example, names for int variables begin with the letter "i," names for byte (unsigned char) variables begin with "by," and so on. In addition, certain prefixes on variable names indicate that a variable is a pointer ("p_"), an array ("a_"), and so on. Here are some typical variable names and what you can infer about the variables from their names:

byError: a byte variable (that contains an error code, probably).
iTank: an integer (that contains the number of a tank, perhaps).
p_iTank: a pointer to an integer.
a_chPrint: an array of characters (to be printed, most likely).
fDone: a flag (indicating whether a process is done).

Hungarian is popular because, even though the variable names are at first somewhat cryptic, many believe that the little bit of information contained in the name makes coding a little easier.²

µC/OS

When you get to the discussion of real-time operating systems in Chapter 6 and beyond, you might want to try your hand at using one of these systems. To help you do that, the CD that accompanies this book has one, named µC/OS (pronounced "micro-see-oh-ess"). It is on the CD with the permission of Jean Labrosse, who wrote the system. You are free to use µC/OS as a study tool, but you may not use it for a commercial product without permission from Mr. Labrosse. µC/OS IS NOT "SHAREWARE." The licensing provisions for µC/OS are shown on page xix.

The information in Chapters 6 and 7 and in Table 11.2 should get you started using this system. If you want more information about the system, see Mr. Labrosse's book, listed in the section on Further Reading, or go to the µC/OS Web site at www.ucos-ii.com. You can contact Mr. Labrosse for support of µC/OS, but please do not do this until you have checked the µC/OS Web site for the latest versions and fixes.

The example programs on the CD and µC/OS are intended for use with the Borland C/C++ compiler for DOS. You can get the "scholar edition" of this compiler (again, for study use, not for use in developing a commercial product) for $49.93 as of this writing. Various retailers carry it, or you can contact Borland at www.borland.com.

µC/OS Licensing Information

µC/OS source and object code can be freely distributed (to students) by accredited colleges and universities without requiring a license, as long as there is no commercial application involved. In other words, no licensing is required if µC/OS is used for educational use.

You must obtain an Object Code Distribution License to embed µC/OS in a commercial product. This is a license to put µC/OS in a product that is sold with the intent to make a profit. There will be a license fee for such situations, and you need to contact Mr. Labrosse for pricing.

You must obtain a Source Code Distribution License to distribute µC/OS source code. Again, there is a fee for such a license, and you need to contact Mr. Labrosse for pricing.

You can contact Jean Labrosse at:

[email protected]

or

Jean J. Labrosse
949 Crestview Circle
Weston, FL 33327 USA
1-954-217-2036 (phone)
1-954-217-2037 (fax)

2. The Hungarian used in this book is actually a dialect, not quite the original convention. The major differences between the dialect and the original are (1) the original does not use an underscore to separate a prefix from the rest of the variable name, and (2) the dialect uses the convention to name functions as well as variables.
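As a small illustration of the Hungarian dialect described in this front matter, the declarations below use the prefixes from the list of examples (by, i, p_, a_, ch, f). The function vMarkTankDone, the constant TANK_COUNT, and the array sizes are invented for this sketch. The "v" prefix for a function that returns void is an assumption about how the dialect extends to function names.

```c
#include <assert.h>

typedef unsigned char BYTE;   /* "byte" in the text means unsigned char */

#define TANK_COUNT 3          /* invented for this sketch */

BYTE byError = 0;             /* by: a byte, here an error code     */
int  iTank = 0;               /* i:  an int, here a tank number     */
int *p_iTank = &iTank;        /* p_: a pointer, here to an int      */
char a_chPrint[16];           /* a_: an array, here of characters   */
int  a_fDone[TANK_COUNT];     /* a_ and f: an array of flags        */

/* v: a function that returns void (hypothetical helper). */
void vMarkTankDone(int iTankToMark)
{
    assert(iTankToMark >= 0 && iTankToMark < TANK_COUNT);
    *p_iTank = iTankToMark;       /* writes iTank through the pointer */
    a_fDone[iTankToMark] = 1;     /* sets that tank's "done" flag     */
}
```

Reading a line such as `*p_iTank = iTankToMark;` you can tell from the names alone that an int is being stored through a pointer to an int, which is the small convenience the convention is meant to provide.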

1 A First Look at Embedded Systems

As microprocessors have become smaller and cheaper, more and more products have microprocessors "embedded" in them to make them "smart." Such products as VCRs, digital watches, elevators, automobile engines, thermostats, industrial control equipment, and scientific and medical instruments are driven by these microprocessors and their software. People use the term embedded system to mean any computer system hidden in any of these products.

Software for embedded systems must handle many problems beyond those found in application software for desktop or mainframe computers. Embedded systems often have several things to do at once. They must respond to external events (e.g., someone pushes an elevator button). They must cope with all unusual conditions without human intervention. Their work is subject to deadlines.

1.1 Examples of Embedded Systems

To understand the issues of embedded-systems software and to make the problems a little more concrete, let's start by examining a few sample systems. We'll look back at these examples from time to time as we discuss specific issues and specific solutions.

Telegraph

The first system that we will study is one that was code-named "Telegraph" during its development. Telegraph allows you to connect a printer that has only a high-speed serial port to a network. From the outside, Telegraph is a little plastic box, 2 to 3 inches on a side and about half an inch thick. A pigtail cable on one side of the box plugs into the serial port on the printer. A connector on the other side of the box plugs into the network. A sketch of Telegraph is shown in Figure 1.1.¹

Figure 1.1 Telegraph [sketch; the network connector is labeled]

1. Telegraph was built to work with Apple inkjet printers, which typically had a serial port that you could connect directly to a Macintosh computer. Its shape allows it to snap directly onto the back of one of these printers. Various versions of it worked with different networks.

Obviously, Telegraph must receive data from the network and copy it onto the serial port. However, Telegraph is rather more complicated than that. Here are just a few things that Telegraph must do:

• On the network, data sometimes arrive out of order, data sometimes get lost along the way, and some of the data sometimes arrive twice. Telegraph must sort out the chaos on the network and provide a clean data stream to the printer.

• There might be lots of computers on the network, all of which might want to print at once. The printer expects to be plugged into a single computer. Telegraph must feed the printer one print job at a time and somehow hold off all the other computers.

• Network printers must provide status information to any computer on the network that requests it, even if they are busy printing a job for some other computer. The original, serial-port printer can't do that. Telegraph has to.

• Telegraph has to work with a number of different types of printers without customer configuration; Telegraph has to be able to figure out the kind of printer to which it is attached.

• Telegraph must respond quite rapidly to certain events. There are, for example, various kinds of network frames to which Telegraph must send a response within 200 microseconds.

• Telegraph must keep track of time. For example, if a computer that has been sending print data to Telegraph crashes, Telegraph must eventually give up on that print job, perhaps after 2 minutes, and print from another computer on the network. Otherwise, one computer's crash would make the printer unavailable to everybody.
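Two of the requirements in this list, feeding the printer one job at a time and giving up on a stalled job, can be sketched in a few lines of C. This is not Telegraph's actual code. The function names, the once-per-second tick, and the 120-second limit (taken from "perhaps after 2 minutes") are assumptions made for illustration.

```c
#include <assert.h>

#define NO_OWNER        (-1)
#define JOB_TIMEOUT_SEC 120     /* "perhaps after 2 minutes" */

static int iOwner = NO_OWNER;   /* computer whose job is printing         */
static int iIdleSeconds = 0;    /* seconds since the owner last sent data */

/* A computer asks to start a print job. Returns 1 if it now owns the
   printer, 0 if it must wait because another job is in progress. */
int fJobRequest(int iComputer)
{
    assert(iComputer >= 0);
    if (iOwner != NO_OWNER && iOwner != iComputer)
        return 0;
    iOwner = iComputer;
    iIdleSeconds = 0;
    return 1;
}

/* Print data arrived. Returns 1 if accepted, 0 if the sender does not
   own the printer. */
int fJobData(int iComputer)
{
    if (iComputer != iOwner)
        return 0;
    iIdleSeconds = 0;           /* the owner is still alive */
    return 1;
}

/* The owner finished its job normally. */
void vJobEnd(int iComputer)
{
    if (iComputer == iOwner)
        iOwner = NO_OWNER;
}

/* Called once per second from a timer. Abandons a stalled job. */
void vJobTick(void)
{
    if (iOwner != NO_OWNER && ++iIdleSeconds >= JOB_TIMEOUT_SEC)
        iOwner = NO_OWNER;      /* the owner has presumably crashed */
}
```

Even this toy version raises the kind of question the rest of the chapter is about: vJobTick runs from a timer while the other functions run when network data arrives, so on a real system the two could touch iOwner at nearly the same moment.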

Telegraph Development Challenges

To satisfy the list of requirements given above, Telegraph has a microprocessor embedded in it. Its software is more extensive and sophisticated than its external appearance might lead you to believe. What problems arise in developing such software? Before reading on, you might consider writing down what you think these problems might be.

To begin with, of course, software for Telegraph must be logically correct. It can't lose track of which computer is printing or drop data or report incorrect status. This is the same requirement placed on every piece of software in both the embedded and the applications arenas.

However, writing the software for Telegraph, like writing software for many other embedded systems, offers up a few additional challenges, which we shall now discuss.

Throughput

The printer can print only as fast as Telegraph can provide data to it. Telegraph must not become a bottleneck between the computers on the network and the printer. For the most part, the problem of getting more data through an embedded system is quite similar to that of getting an application to run faster. You solve it by clever programming: better searching and sorting, better numerical algorithms, data structures that are faster to parse, and so on. Although these techniques are beyond the scope of this book, we will discuss mistakes possible with real-time operating systems that will spoil your throughput.

Response

When a critical network frame arrives, Telegraph must respond within 200 microseconds, even if it is doing something else when the frame arrives. The software must be written to make this happen. We will discuss response extensively, because it is a common problem in embedded systems and because all of the solutions represent compromises of various kinds.

People often use the relatively fuzzy word "speed." However, embedded system designers must deal with two separate problems, throughput and response, and the techniques for dealing with the two are not at all the same. In fact, dealing with one of these problems often tends to make the other one worse. Therefore, in this book we will stick to the terms throughput and response, and we will avoid speed.

Testability

It is not at all easy to determine whether Telegraph really works. The problem is that a lot of the software deals with uncommon events. Telegraph is typical of embedded systems in this regard, because these systems must be able to deal with anything without human intervention. For example, lots of the Telegraph code is dedicated to the problem that data might get lost on the network. However, data doesn't get lost very often, especially in a testing laboratory, where the network is probably set up perfectly, is made entirely of brand-new parts, and is all of 15 feet long. This makes it hard to test all those lines of code.

Similarly, Telegraph must deal with events that are almost simultaneous. If two computers request to start their print jobs at exactly the same time, for example, does the software cope properly? Telegraph contains code to handle this situation, but how do you make it happen in order to test that code?

We will discuss testing problems.
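One way to exercise rarely executed code, sketched here under assumptions, is to replace the real network with a test stub that loses or duplicates frames on purpose. Everything below is hypothetical: the sequence-number scheme, the fault modes, and the names vReceiveFrame and vStubSend are invented to show the idea and are not taken from Telegraph.

```c
/* Code under test: accepts only the next expected sequence number.
   Wraparound of sequence numbers is ignored in this sketch. */
static unsigned uiNextSeq = 0;      /* next sequence number expected   */
static unsigned uiDelivered = 0;    /* frames passed on to the printer */
static unsigned uiDuplicates = 0;   /* frames seen before and ignored  */
static unsigned uiGaps = 0;         /* frames that arrived too early   */

void vReceiveFrame(unsigned uiSeq)
{
    if (uiSeq == uiNextSeq) {        /* the normal, common case */
        ++uiDelivered;
        ++uiNextSeq;
    } else if (uiSeq < uiNextSeq) {  /* already delivered: a duplicate */
        ++uiDuplicates;
    } else {                         /* something was lost before this */
        ++uiGaps;                    /* real code would ask for a resend */
    }
}

/* Test stub standing in for the network. */
enum FAULT { FAULT_NONE, FAULT_DROP, FAULT_DUPLICATE };

void vStubSend(unsigned uiSeq, enum FAULT fault)
{
    if (fault == FAULT_DROP)
        return;                      /* the frame is "lost" */
    vReceiveFrame(uiSeq);
    if (fault == FAULT_DUPLICATE)
        vReceiveFrame(uiSeq);        /* the frame "arrives twice" */
}
```

A test can then send frame 1 with FAULT_DUPLICATE and frame 2 with FAULT_DROP and check the counters, forcing in a few lines the conditions that a 15-foot laboratory network almost never produces.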

Debugability

What do you think typically happens when testing uncovers a bug in the Telegraph software? Telegraph has no screen; no keyboard; no speaker; not even any little lights. When a bug crops up, you don't get any cute icons or message boxes anywhere. Instead, Telegraph typically just stops working. A bug in the network software? A bug in the software that keeps track of which computer is printing? A bug in the software that reports printer status? Telegraph just stops working.

Unfortunately, having Telegraph stop working doesn't give you much information about a bug. Further, with no keyboard and screen you can't run a debugger on Telegraph. You must find other ways to figure out what has happened. We will discuss techniques for debugging embedded-systems software, and we'll discuss a few techniques for keeping some of the more difficult bugs from creeping into your software in the first place.
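One technique often used on systems with no screen is to record recent events in a small circular buffer in memory, where a laboratory tool can read them after the system stops. The sketch below shows the idea only; it is not necessarily the approach this book presents later, and the buffer size and function names are invented.

```c
#include <assert.h>

#define TRACE_SIZE 8u                        /* must be a power of two */

static unsigned char a_byTrace[TRACE_SIZE];  /* most recent event codes */
static unsigned uiTraceCount = 0;            /* total events recorded   */

/* Record one event code, overwriting the oldest when the buffer is full. */
void vTrace(unsigned char byEvent)
{
    a_byTrace[uiTraceCount & (TRACE_SIZE - 1u)] = byEvent;
    ++uiTraceCount;
}

/* Return the event recorded uiBack events ago (0 means the most recent). */
unsigned char byTraceRecent(unsigned uiBack)
{
    assert(uiBack < TRACE_SIZE && uiBack < uiTraceCount);
    return a_byTrace[(uiTraceCount - 1u - uiBack) & (TRACE_SIZE - 1u)];
}
```

Calls such as vTrace(3) cost only a few instructions each, so they can be left at interesting points in the code. When the box "just stops working," the last eight event codes show what it was doing beforehand.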

Reliability

Like most other embedded systems, Telegraph is not allowed to crash. Although customers seem to have some tolerance for desktop systems that must be rebooted once in a while, nobody has any patience with little plastic boxes that crash. In particularly awkward situations, application software can put a message on the screen and ask the user what to do. Embedded systems do not have that option; whatever happens, the software must function without human intervention.

Memory Space

Telegraph has only a very finite amount of memory, specifically 32 KB of memory for its program and 32 KB of memory for its data. This was as much memory as Telegraph could have if its price were to be reasonable. Memory gets nothing but cheaper, but it still isn't free. Making software fit into the available space is a necessary skill for many embedded-system software engineers, and we'll discuss it.
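A simple habit that helps with a fixed memory budget is to declare the large data structures statically and let the compiler verify that they fit. The sketch below assumes a C11 compiler (for _Static_assert). The 32 KB figure comes from the text, but the structure, the buffer sizes, and the 4 KB reserve are invented for illustration.

```c
#include <stddef.h>

#define DATA_BUDGET_BYTES (32u * 1024u)     /* Telegraph's data memory */

/* All large data lives in one statically allocated structure, so its
   total size is known at compile time. The sizes are made up. */
struct TELEGRAPH_DATA {
    unsigned char a_byNetRx[8u * 1024u];    /* frames from the network   */
    unsigned char a_byPrintTx[16u * 1024u]; /* data bound for the printer */
    unsigned char a_byStatus[256];          /* cached printer status     */
};

static struct TELEGRAPH_DATA data;

/* Leave 4 KB for the stack and small variables (an assumption). The
   build fails here if someone enlarges a buffer past the budget. */
_Static_assert(sizeof(struct TELEGRAPH_DATA) <= DATA_BUDGET_BYTES - 4096u,
               "static data exceeds the memory budget");

/* Bytes of the data budget not claimed by the static structure. */
size_t uiBudgetRemaining(void)
{
    return DATA_BUDGET_BYTES - sizeof data;
}
```

The point of the design is that running out of memory becomes a compile-time error at the developer's desk instead of a mysterious failure in a customer's office.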

Program Installation

The software in Telegraph didn't get there because somebody clicked a mouse on an icon. We will discuss the special tools that are needed to install the software into embedded systems.

Cordless Bar-Code Scanner

Let's turn to another embedded-systems example, a cordless bar-code scanner. Whenever its user pulls the trigger, the cordless bar-code scanner activates its laser to read the bar code and then sends the bar code across a radio link to the cash register. (See Figure 1.2.)

How do the problems of developing the software for the cordless bar-code scanner compare to those of developing the software in Telegraph?

Well, they're mostly the same. One problem that the cordless bar-code scanner does not have is the problem of throughput. There just isn't very much data in a bar code, and the user can't pull the trigger that fast. On the other hand, the cordless bar-code scanner has one problem that Telegraph does not.

Figure 1.2 Cordless Bar-Code Scanner [sketch; labels include "Laser reads bar code" and "Radio sends bar code to cash register"]

Power Consumption

Since the scanner is cordless, its battery is its only source of power, and since the scanner is intended to be handheld, the weight of the battery is limited by what an average user can comfortably hold up. How long does the customer want the battery to last? The obvious answer, forever, isn't feasible. What is the next best answer?

The next best answer is that the battery should last for an 8-hour shift. After that, the scanner can go back into a holster on the side of the cash register for the night and recharge its battery. It turns out, however, that it also isn't feasible to run a laser, a microprocessor, a memory, and a radio for 8 hours on battery power. Therefore, one of the major headaches of this software is to figure out what parts of the hardware are not needed at any given time and turn those parts off. That includes the processor. We'll discuss this, too.

Laser Printer

Another embedded system is the laser printer. Most laser printers have fairly substantial microprocessors embedded in them to control all aspects of the printing. In particular, that microprocessor is responsible for getting data from the various communication ports on the printer, for sensing when the user presses a button on the control panel, for presenting messages to the user on the control panel display, for sensing paper jams and recovering appropriately, for noticing when the printer has run out of paper, and so on.

But the largest responsibility of the microprocessor is to deal with the laser engine, which is that part of the printer responsible for putting black marks on the paper. The only thing that a laser engine knows how to do without microprocessor assistance is to put a black dot or not to put a black dot at each location on a piece of paper. It knows nothing about the shapes of letters, fonts, font sizes, italic, underlining, bold, or any of those other things that printer users take for granted. The microprocessor must read the input data and figure out where each black dot should go. This brings us to another problem found in some embedded systems.

Processor Hogs

Figuring out where the black dots go when a printer has been asked to print some text on a slanted line with an unusual font in a screwball size takes a lot of time, even for powerful microprocessors. Users expect a quick response when they push buttons, however; it is no concern of theirs that the microprocessor is busy figuring out values for trigonometric functions to discover where on the page the serifs of a rotated letter should go. Work that ties up the processor for long periods of time makes the response problem that much harder.
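One common way to keep a long calculation from spoiling response is to do the work in small slices and check for user input between slices. The sketch below is written under assumptions and is not taken from any real printer: the row-at-a-time rendering, the page size, and all of the names are invented.

```c
#include <assert.h>

#define PAGE_ROWS      100   /* rows of dots on the (tiny) sample page */
#define ROWS_PER_SLICE 10    /* rows rendered between button checks    */

static int iRowsRendered = 0;   /* progress through the current page */
static int iButtonChecks = 0;   /* how often the panel was polled    */

/* Stand-in for the expensive work of placing the dots on one row. */
static void vRenderRow(int iRow)
{
    assert(iRow >= 0 && iRow < PAGE_ROWS);
    ++iRowsRendered;
}

/* Stand-in for reading the control panel buttons. */
static void vPollButtons(void)
{
    ++iButtonChecks;
}

/* Render a whole page, but never go more than ROWS_PER_SLICE rows
   without giving the buttons some attention. */
void vRenderPage(void)
{
    int iRow;

    iRowsRendered = 0;
    for (iRow = 0; iRow < PAGE_ROWS; ++iRow) {
        vRenderRow(iRow);
        if ((iRow + 1) % ROWS_PER_SLICE == 0)
            vPollButtons();
    }
}
```

Polling like this is the simplest option, and it has an obvious weakness: the worst-case button response is still as long as one slice. The interrupt and real-time operating system chapters listed in the contents address the same problem with different tradeoffs.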

Underground Tank Monitor

The underground tank monitoring system watches the levels of gasoline in the underground tanks at a gas station. Its principal purpose is to detect leaks before the gas station turns into a toxic waste dump by mistake and to set off a loud alarm if it discovers one. The system also has a panel of 16 buttons, a 20-character liquid crystal display, and a thermal printer. With the buttons, the user can tell the system to display or print various information such as the gasoline levels in the tanks or the time of day or the overall system status.

To figure out how much gasoline is in one of the tanks, the system first reads the level of two floats in the tank, one of which indicates the level of the gasoline and one of which indicates the level of the water that always accumulates in the bottom of such tanks. It also reads the temperature at various levels in the tank; gasoline expands and contracts considerably with changes in temperature, and this must be accounted for. The system should not set off the alarm just because the gasoline cooled off and contracted, thereby lowering the float.
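The temperature correction can be illustrated with a short calculation. The sketch below assumes a simple linear expansion model, a coefficient of roughly 0.00095 per degree Celsius (an approximate figure; real values vary with the blend), and a 15 °C reference temperature. It uses a single temperature reading and double arithmetic for clarity, although a real tank monitor reads several temperatures and may have no floating-point support at all. The function name is invented.

```c
#include <assert.h>

#define GAS_EXPANSION_PER_DEG_C 0.00095  /* approximate; varies by blend  */
#define REFERENCE_TEMP_C        15.0     /* assumed reference temperature */

/* Convert a volume measured at dTempC to the volume the same gasoline
   would occupy at the reference temperature. */
double dCompensatedVolume(double dMeasuredLiters, double dTempC)
{
    double dFactor =
        1.0 + GAS_EXPANSION_PER_DEG_C * (dTempC - REFERENCE_TEMP_C);

    assert(dMeasuredLiters >= 0.0 && dFactor > 0.0);
    return dMeasuredLiters / dFactor;
}
```

Under these assumptions, 10,000 liters measured at 25 °C and about 9,812 liters measured at 5 °C both correspond to roughly 9,906 liters at the reference temperature. The float would drop by nearly 190 liters' worth with no leak at all, which is why the leak test has to compare compensated volumes.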

None of this would be particularly difficult, except for the problem of cost that often arises in the context of embedded systems.

Cost

A gas station owner buys one of these systems only because some government agency tells him he has to. Therefore, he wants it to be as inexpensive as possible. Therefore, the system will be built with an extremely inexpensive microcontroller, probably one that barely knows how to add 8-bit numbers, much less how to use the coefficient of expansion of gasoline in any efficient way. Therefore, the microprocessor in this system will find itself extraordinarily busy just calculating how much gasoline there really is down there; that calculation will turn into a processor hog.

A sketch of the underground tank monitor is in Figure 8.7. Figure 8.6 contains a more detailed description of what the underground tank monitor does.

Nuclear Reactor Monitor

Last, one very simple example from which we can learn a surprising amount is a hypothetical system that controls a nuclear reactor. Our hypothetical system must do many things, but the only aspect that will interest us is the part of the code that monitors two temperatures, which are always supposed to be equal. If they differ, it indicates that a malfunction in the reactor is sending it toward China. We'll revisit this system several times.

1.2 Typical Hardware

If you know generally what kinds of hardware parts are typically found in embedded systems, you can skip this section. Otherwise, read on for a summary of what usually inhabits one of these systems.

First, all of the systems need a microprocessor. The kinds of microprocessors used in embedded systems are quite varied, as you can see from Table 1.1, a list of some of the common microprocessor families and their characteristics.

Table 1.1 Microprocessors Used in Embedded Systems

Processor                Bus Width  Largest External Memory                Internal Peripherals                         Speed (MIPS)
Zilog Z8 family          8          None on some models; 64 KB on others   2 timers
Intel 8051 family        8          64 KB program + 64 KB data             3 timers + 1 serial port
Zilog Z80 family         8          64 KB; 1 MB, sort of                   Various                                      2
Intel 80188              8          1 MB                                   3 timers + 2 DMA channels                    2
Intel 80386 family       16         64 MB                                  3 timers + 2 DMA channels + various others   5
Motorola 68000 family    32         4 GB                                   Varying                                      10
Motorola PowerPC family  32         64 MB                                  Many                                         75

(Note that the semiconductor companies all sell a variety of models of each microprocessor. The data in Table 1.1 are typical of these microprocessor families; individual microprocessors may differ considerably from what is shown in Table 1.1.)

An embedded system needs memory for two purposes: to store its program and to store its data. Unlike a desktop system, in which programs and data are stored in the same memory, embedded systems use different memories for each of the two different purposes. Because the typical embedded system does not have a hard disk drive from which to load its program, the program must be stored in the memory, even when the power is turned off. As you are no doubt aware, the memory in a desktop system forgets everything when the power is turned off. The embedded system needs special kinds of memory that will remember the program, even with no power. Unfortunately, as we will discuss in Chapter 2, these special memories are not very suitable for data; therefore, embedded systems need some regular memory for that.

After a processor and memory, embedded systems are more noted for what they do not have than for what they do. Most embedded systems do not have the following:

• A keyboard. Some systems may have a few push buttons for user input; some (Telegraph, for example) do not have even that.

• A screen. Many systems, especially in consumer products, will have a liquid crystal display with two or three dozen characters. A laser printer, for example, commonly has a two-line status display with 10 or 12 characters on each line. Other systems do not even have this much output capability. Some may just have a few light-emitting diodes (those tiny lights you sometimes see on systems) to indicate certain basic system functions.

• A disk drive. The program is stored in the memory, and most embedded systems do not need to store much data on a permanent basis. Those that do typically use various kinds of specialized memory devices rather than disk drives.

• Compact discs, speakers, microphones, diskettes, modems. Most embedded systems have no need for any of these items.

What embedded systems very often do have are a standard serial port, a network interface, and hardware to interact with sensors and activators on equipment that the system is controlling.

Chapter Summary

• An embedded system is any computer system hidden inside a product other than a computer.

• You will encounter a number of difficulties when you write embedded-system software in addition to those you encounter when you write applications:

  • Throughput: Your system may need to handle a lot of data in a short period of time.

  • Response: Your system may need to react to events quickly.

  • Testability: Setting up equipment to test embedded software can be difficult.

  • Debugability: Without a screen or a keyboard, finding out what the software is doing wrong (other than not working) is a troublesome problem.

  • Reliability: Embedded systems must be able to handle any situation without human intervention.

  • Memory Space: Memory is limited on embedded systems, and you must make the software and the data fit into whatever memory exists.

  • Program Installation: You will need special tools to get your software into embedded systems.

  • Power Consumption: Portable systems must run on battery power, and the software in these systems must conserve power.

  • Processor Hogs: Computing that requires large amounts of CPU time can complicate the response problem.

  • Cost: Reducing the cost of the hardware is a concern in many embedded system projects; software often operates on hardware that is barely adequate for the job.

• Embedded systems have a microprocessor and a memory. Some have a serial port or a network connection. They usually do not have keyboards, screens, or disk drives.

Hardware Fundamentals for the Software Engineer

If you're familiar with schematics, you can skip this chapter and the next. (If you simply want to get on with the software, you can also skip these two chapters, but if you know nothing about hardware, you may end up having to peek back at them to understand some of the material in the later chapters.)

Although a software engineer who writes only applications may spend an entire career and learn nothing about hardware, an embedded-systems software engineer usually runs up against hardware early on. The embedded-systems software engineer must often understand the hardware in order to write correct software; must install the software on the hardware; must sometimes figure out whether a problem is caused by a software bug or by something wrong in the hardware; and may even be responsible for reading the hardware schematic diagram and suggesting corrections.

In this chapter and the next we will discuss the basics of digital hardware. These chapters will provide you with enough information to read the schematic diagrams for a typical embedded system well enough to be able to write the software and talk intelligently to the hardware engineers. There is not nearly enough information here for you to start designing your own hardware.

2.1 Terminology

Some Very Basic Terms

Most digital electronic circuits are built with semiconductor parts called chips that are purchased from manufacturers specializing in building such parts.

The semiconductors themselves are encased in small, thin, square or rectangular black packages made of plastic or ceramics. To attach the semiconductors to the outside world, each package has a collection of pins, stiff metallic legs that protrude from the sides of the package. Depending upon the type of the part, there may be from 8 to 500 pins. (See Figure 2.1.) The chip manufacturers provide information about each of their products in documents called data sheets.

Figure 2.1 Various Types of Chip Packages
[Package types shown: Dual Inline Package (DIP), Plastic Leaded Chip Carrier (PLCC), Thin Small Outline Package (TSOP), Plastic Quad Flat Pack (PQFP)]

The most common mechanism to connect the chips to one another is the printed circuit board or board, a thin board typically made out of fiberglass with the required connections printed on it in copper. The chips are soldered to the appropriate locations on the board after it has been manufactured. Companies building embedded-system products typically must design their own boards, although many of them will subcontract their manufacture and assembly to companies that specialize in this.

Hardware engineers record their designs by drawing schematic diagrams, drawings that show each part needed in the circuit and the interconnections needed among them. An example of one is shown in Figure 3.20. In this chapter and in Chapter 3 we will discuss some of the symbols and conventions that are used on schematics. Note that schematic diagrams are not layouts showing where the parts are located on the board (although many schematics will contain notations to indicate where the parts are).

Some More Basic Terms

Most digital circuits use just two voltages to do their work:

• 0 volts, sometimes called ground or low.

• Either 3 volts or 5 volts, sometimes called VCC (which stands for Voltage Connected to Collector) or high.¹

At any given time, every signal in the circuit is at one of these two voltages (although there is some very short transition time from one voltage to another). For most circuit components, voltages within a volt or so of high or low are good enough. For example, "low" might be from 0 volts to 1 volt; "high," from 3 to 5 volts; and from 1 volt to 3 volts might not be allowed. The entire activity of the circuit consists of the changing of the signals from high to low and from low to high as the various parts in the circuit signal one another.

With a few exceptions, whenever signals represent data or addresses, the low voltage represents a 0, and the high voltage represents a 1. In addition to data and addresses, however, every circuit contains many signals whose purposes are to indicate various conditions, such as "reset the microprocessor" or "get data from this memory chip." These signals are said to be asserted when they are signaling whatever it is that they signal. For example, when the microprocessor wants to get data from a particular memory chip, the engineer must design the circuit to assert the "get data from this memory chip" signal. Some signals are asserted when they are high, and some are asserted when they are low.² You must read the schematic and the information in the data sheets about the parts in the circuit to determine which are which.

Most hardware engineers will assign a name to each signal in the circuit. For example, the data signals might be named D0, D1, D2, and so on. The address signals might be A0, A1, A2, and so on. The signal that indicates "read from memory now" might be named MEMREAD. Many careful engineers will give a special name to each signal that is asserted when it is low by starting or ending the name with an asterisk (*), ending the name with a slash (/), or by putting a bar over the name. For example, a signal named MEMREAD/ or *MEMREAD would most likely be a signal that is set low to read from memory.

1. If the parts in your system have been built using metal oxide semiconductor (MOS) technology, you'll sometimes hear the term VDD instead of VCC and VSS instead of ground.

2. The electronic properties of the semiconductor materials from which chips are made make it "natural," from the perspective of the chip designer, for certain signals to be asserted high and others low. Also, as we'll see later when we examine open collector devices in Section 2.3, high and low have somewhat different properties when you connect chips to one another.
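The naming conventions for low-asserting signals can be sketched in a few lines of Python. This is only an illustration: the helper names `is_active_low` and `is_asserted` are hypothetical, a bar over a name cannot be written in plain text and is ignored, and on a real board the schematic and data sheets, not the name, have the final word.

```python
HIGH, LOW = 1, 0


def is_active_low(name: str) -> bool:
    """Guess a signal's polarity from the naming conventions in the text.

    A leading or trailing asterisk, or a trailing slash, marks a signal
    that is asserted when it is low.
    """
    return name.startswith("*") or name.endswith("*") or name.endswith("/")


def is_asserted(name: str, level: int) -> bool:
    """Return True if a signal with this name is asserted at this level."""
    if is_active_low(name):
        return level == LOW
    return level == HIGH
```

With these definitions, `is_asserted("MEMREAD/", LOW)` and `is_asserted("MEMREAD", HIGH)` are both true, which is the distinction the naming convention is meant to make visible.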

Chips have connections through which they expect to control the voltage level on the attached signal (outputs) and other connections through which they expect to sense the voltage level on the attached signal (inputs). Most signals are connected to the output of just one part in the circuit; each may be connected to the inputs of several parts in the circuit, however. The part whose output controls the voltage on a given signal is said to drive the signal. If no part on the circuit is driving a signal, then that signal is said to be floating. Its voltage will be indeterminate and may change as time passes. The results of a floating signal vary between harmless and disastrous, depending upon how the parts with inputs connected to the floating signal cope with the problem.

If two parts drive the same signal at the same time, things work pretty well as long as the two parts both drive high or both drive low. If one tries to drive one way and the other tries to drive the other, then the usual result is to destroy one (or both) of the parts. Usually the parts get very hot (hot enough to raise a blister on your thumb if you touch one of them), then they stop working for good. This is sometimes called a bus fight. Bus fights that last only a short time (say, several nanoseconds) but that occur periodically may not destroy the parts, but may cause the circuit to run unreliably and to become less reliable as time goes by. Bus fights invariably indicate an error in the hardware design.
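The three situations just described (nobody driving, drivers that agree, drivers that disagree) can be summarized in a small sketch. It is a logical model only, with a hypothetical function name `resolve`; it says nothing about the currents, heat, or timing involved in a real bus fight.

```python
HIGH, LOW = 1, 0


def resolve(drive_levels):
    """Classify a signal from the levels that ordinary outputs drive onto it.

    drive_levels holds one entry (HIGH or LOW) for each part that is
    currently driving the signal.
    """
    levels = set(drive_levels)
    if not levels:
        return "floating"   # no part drives the signal: voltage is indeterminate
    if len(levels) > 1:
        return "bus fight"  # one part drives high while another drives low
    return "high" if levels == {HIGH} else "low"
```

For example, `resolve([])` reports a floating signal, and `resolve([HIGH, LOW])` reports a bus fight.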

2.2 Gates

A very simple part built from just a handful of semiconductor transistors is called a gate, or sometimes a discrete. In this section we cover some of the very basic gates used in typical digital hardware circuits. Although you can buy parts that contain just one gate each, chips that contain three, four, or even five or six of these very simple circuit elements are the norm.

Inverters, AND Gates, and OR Gates

Figure 2.2 shows the symbol that hardware engineers place on their schematic to indicate an AND gate. An AND gate is one whose output (shown at the right of the figure) is driven high if both of the inputs are high and whose output is driven low if either input is low or if both inputs are low. The table in Figure 2.2 shows this.

Figure 2.2 AND Gate

You can also get AND gates with three or even more inputs, as shown in Figure 2.3. The outputs of these gates are high if all of the inputs are high.

Figure 2.3 Multiple-Input AND Gates

Figure 2.4 shows the symbol for an OR gate. An OR gate is one whose output (again, shown at the right of the figure) is driven high if either or both of the inputs are high and whose output is driven low only if both inputs are low. As with AND gates, you will find circuits with multiple-input OR gates.

Figure 2.4 OR Gate

Figure 2.5 shows the symbol for an XOR gate or exclusive OR gate. An XOR gate is one whose output (again, shown at the right of the figure) is driven high if one but not both of the inputs are high and whose output is driven low if both inputs are high or both are low.

Figure 2.5 XOR Gate

Input 1   Input 2   Output
High      High      Low
High      Low       High
Low       High      High
Low       Low       Low

Figure 2.6 shows the symbol for an inverter. Inverters are very simple: the output is driven low if the input is high and vice versa.

Figure 2.6 Inverter

The Bubble

The little loop, or bubble, on the inverter symbol is used in other schematic symbols to indicate that an input or an output is inverted (low when it would otherwise be high and vice versa). For example, the symbol in Figure 2.7 is the one for a gate whose output is the opposite of an AND gate: its output is low if both inputs are high and is high otherwise. This gate is called a not-AND gate or, more often, a NAND gate.

Figure 2.7 NAND Gate

Input 1   Input 2   Output
High      High      Low
High      Low       High
Low       High      High
Low       Low       High

The bubble can be used for the inputs on a gate as well (see Figure 2.8). The operation of this gate is to invert each input, and then to feed the inverted inputs to a regular OR gate.

Figure 2.8 OR Gate with Negated Inputs

Input 1   Input 2   Output
High      High      Low
High      Low       High
Low       High      High
Low       Low       High

Occasionally, you'll even see the symbol shown in Figure 2.9. It's just the same as the inverter we saw before. The triangular part of this symbol represents a driver, a device that drives its output signal to match its input signal. The bubble indicates that the gate also inverts; for an inverter, it makes no difference whether the bubble is shown on the input or the output.

Figure 2.9 Another Inverter

When the circuit is actually built, of course, the manufacturer will use the same sort of inverter, regardless of which symbol is on the schematic. Why would an engineer favor one symbol over another? Some engineers follow the convention that a signal that goes into or comes out of a bubble is one that is asserted low. These engineers might use the symbol in Figure 2.9 if the input signal (on the left) is one that asserts low and would use the one in Figure 2.6 if the input signal is one that asserts high. (Note, however, that this is not the only convention and that many engineers will use the symbol in Figure 2.6 consistently, regardless of which signals assert high or low. Note also that some engineers are careless about this in any case.)

In a similar vein, a NAND gate and an OR gate with inverted inputs also are identical. You can convince yourself of this by reviewing the truth tables in Figure 2.7 and Figure 2.8: they're the same. As with the inverter, the same part will go into the circuit, no matter which symbol is on the schematic. Many engineers will use the symbol for the OR gate with inverted inputs if the underlying operation is more 'or'-like ("I want the output to assert if input 1 is asserted or input 2 is asserted, ignoring the issues of low-asserting signals") and will use the NAND symbol if the underlying operation is more 'and'-like. Again, however, this is not the only convention.

You can invent schematic symbols with bubbles on some inputs but not others, but no manufacturer makes parts that correspond to them. See Figure 2.10.

Figure 2.10 Another Circuit
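The gate definitions in this section can be checked mechanically. The sketch below models high as `True` and low as `False` and builds each two-input truth table in the same row order the figures use; the function names are chosen for this illustration only. In particular it confirms that a NAND gate and an OR gate with negated inputs have identical truth tables.

```python
from itertools import product

HIGH, LOW = True, False


def and_gate(a, b):
    return a and b


def or_gate(a, b):
    return a or b


def xor_gate(a, b):
    return a != b


def inverter(a):
    return not a


def nand_gate(a, b):
    # An AND gate with a bubble on its output.
    return inverter(and_gate(a, b))


def or_negated_inputs(a, b):
    # An OR gate with a bubble on each input.
    return or_gate(inverter(a), inverter(b))


def truth_table(gate):
    """Return (input 1, input 2, output) rows: HH, HL, LH, LL."""
    return [(a, b, gate(a, b)) for a, b in product((HIGH, LOW), repeat=2)]
```

Comparing `truth_table(nand_gate)` with `truth_table(or_negated_inputs)` shows that the two are equal, which is why the same part can stand behind either schematic symbol.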

2.3 A Few Other Basic Considerations

Decoupling

For the most part, the problems of providing power to a circuit are beyond the scope of this book. However, there are several useful things to know.

The first thing to know is that, with very few exceptions, each chip in any circuit has a power pin (sometimes called a VCC pin), which must be connected to a signal that is always high (at VCC), and a ground pin, which must be connected to a signal that is always low. These two are in addition to the pins for the various input and output signals, and they provide power to run the part itself. For example, the standard "7400" part has four NAND gates in it. Each NAND gate has two inputs and one output, for a total of 12 connections. The 7400 package has 14 pins: the 12 signal connections plus a power pin and a ground pin. The connections to the power pin and the ground pin usually do not appear on circuit schematics, but they must be made for the circuit to work. In fact, one common test when a circuit isn't running is to use a voltage meter to ensure that power and ground are connected as required to each part.

When it is necessary to show VCC and ground connections on a schematic, engineers use the symbols shown in Figure 2.11.

Figure 2.11 Power and Ground Symbols

One problem that hardware engineers must solve is that most chips use much more power at some times than at others. Typically, if a chip must change many of its output signals from high to low or from low to high at the same time, that chip will need a lot of power to change these signals. In fact, it needs more power than the skinny conductors on the average board can provide quickly. Unless you take steps to combat this problem, you end up with what amounts to a localized brownout for a few microseconds. Most types of chips stop working temporarily if the voltage drops by 10 percent or so, even for a few microseconds, so circuits subject to brownouts fail often. To deal with this, engineers add capacitors to the circuit, with one end of the capacitor connected to the signal providing power and the other to the signal providing ground. A capacitor is a device that stores a small amount of electricity, like a minuscule rechargeable battery.

If some part in the vicinity of a capacitor suddenly needs a lot of power, and the voltage begins to fall because of that, the capacitor will give up its stored electricity to maintain the voltage level. At other times, the capacitor quietly recharges itself. A capacitor can smooth over brownouts that last up to a few microseconds, enough to take care of the voltage drops caused when other parts of the circuitry suddenly demand a lot of power.

A capacitor used this way is called a decoupling capacitor. Decoupling capacitors are usually scattered around the circuit, since they need close proximity to the parts needing power to do their work effectively. They are often shown on the schematic. The symbol used for a capacitor is shown in Figure 2.12. On many schematics, you'll see something like the diagram shown in Figure 2.13, which indicates that a collection of decoupling capacitors needs to be placed in the circuit.

Figure 2.12

Figure 2.13 Decoupling Capacitors

Open Collector and Tri-Stating Outputs

One special class of outputs, the open collector outputs, allows you to attach the outputs of several devices together to drive a single signal. Unlike the usual outputs, which drive signals high or drive them low, the open collector outputs can drive their outputs low or let them float. With open collector outputs, there is no such thing as a bus fight. If several open collector outputs are attached to the same signal, then the signal goes low if any of the outputs is driving low.

Figure 2.14 Open Collector Outputs

Figure 2.14 shows how you might use devices with open collector outputs. If your microprocessor has only one input for an interrupt signal but you have two devices that need to signal interrupts, and if the following two conditions hold, then the circuit in Figure 2.14 will work.

• The interrupt input on the microprocessor is asserted when it is low.

• The interrupt outputs on the two devices are asserted when they are low and they are both open collector outputs.

(We'll discuss interrupts further in Section 3.4 and in Chapter 4; here we'll just explain how the circuit works.) If one of the devices wants to signal the interrupt, then it drives its interrupt output low, the signal INT/ will go low, and the microprocessor will respond to the interrupt signal. (Then it's a small matter of software to figure out which device signaled the interrupt.) If neither device wants to signal the interrupt, each will let its output float, the pullup resistor will cause INT/ to go high, and the microprocessor will sense its interrupt input as not asserted.

Note that the pullup resistor is necessary for this circuit to work; otherwise, INT/ would float when neither device wanted to interrupt. The pullup resistor ensures that INT/ goes high in this situation. Note also that you cannot omit the resistor and connect the INT/ signal directly to VCC. If you did this, then you would have a bus fight on your hands as soon as one of the devices tried to drive INT/ low, since the parts that provide electrical power to your circuit would then try to keep INT/ high. The resistor provides a necessary buffer to prevent this bus fight. See the following section on Floating Signals (or Multiply Driven Signals) for more discussion of pullup resistors.
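The behavior of the shared interrupt signal in Figure 2.14 can be modeled in a few lines. This is a logical sketch under the two conditions listed above (low-asserting interrupt input, low-asserting open collector outputs); the function names are invented for the illustration, and `None` stands in for an output that is floating.

```python
HIGH, LOW = 1, 0
FLOAT = None  # an open collector output that is not driving the signal


def open_collector_output(wants_to_interrupt: bool):
    """A low-asserting open collector output: drive low to assert, else float."""
    return LOW if wants_to_interrupt else FLOAT


def shared_signal(outputs, pullup=True):
    """Level of one signal shared by several open collector outputs.

    Any output that drives low wins. If every output floats, the pullup
    resistor takes the signal high; without a pullup the signal floats.
    """
    if any(out == LOW for out in outputs):
        return LOW
    return HIGH if pullup else FLOAT


def interrupt_asserted(device_a: bool, device_b: bool) -> bool:
    """True if the microprocessor's low-asserting interrupt input is asserted."""
    level = shared_signal([open_collector_output(device_a),
                           open_collector_output(device_b)])
    return level == LOW
```

Calling `shared_signal([FLOAT, FLOAT], pullup=False)` returns `FLOAT`, which is the situation the pullup resistor exists to prevent.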

Standard parts either drive their output signals high or drive them low. Open collector outputs drive their signals low or let them float. Another class of outputs can drive signals high, drive them low, or let them float. Since letting the output signals float is a third possible state (after driving them high or driving them low), these outputs are usually called the tri-state outputs, and letting the signals float is called tri-stating or going into the high impedance state. Tri-state devices are useful when you want to allow more than one device to drive the same signal.

The circuit shown in Figure 2.15 shows a simple use for tri-state devices. The triangular symbol in that schematic is a tri-state driver. A tri-state driver works like this: when the select signal is asserted, then the tri-state driver output will be the same as the input; when the select signal is not asserted, the output on the tri-state driver floats. In the circuit in Figure 2.15, if SELECT A is asserted and SELECT B and SELECT C are not, then the tri-state driver A can drive the OUTPUT signal high or low; tri-state drivers B and C do not drive it. The OUTPUT signal will reflect the input of the driver for which the select signal is asserted. Note that you can get tri-state drivers whose select signals assert high and others whose select signals assert low.

Figure 2.15 A Circuit Using Tri-State Drivers
[Signals shown: SELECT A, SELECT B, SELECT C; INPUT A, INPUT B, INPUT C; OUTPUT. Inset: schematic symbol for a tri-state driver.]

Consider this extremely common situation: in your circuit a microprocessor, a memory chip, and some I/O device must send bytes to one another. You could have multiple sets of data signals: eight signals for the memory to send bytes to the microprocessor, eight more for the microprocessor to send bytes to the I/O, and so on. However, if all the devices can tri-state their data output signals, you can use just one collection of data signals to interconnect all of them. When the microprocessor wants to send data to the memory, the I/O device and the memory tri-state their outputs, and the microprocessor can drive its data onto the data signals. Whenever one of the devices wants to send data, the other two tri-state their outputs while the sending device drives the data signals.

Figure 2.15, by the way, illustrates a common convention used on schematic diagrams: the dot. Where the three tri-state driver outputs intersect, the solid black dot indicates that these signals are to be connected to one another. The usual convention on schematics is that two crossing lines on a schematic are not connected without a dot. So, for example, in Figure 2.15 the INPUT A signal is not attached to SELECT B or SELECT C, nor is INPUT B connected to SELECT C, even though the lines representing these signals cross one another on the left hand side of the Figure 2.15 schematic.

Floating Signals (or Multiply Driven Signals)

The circuit in Figure 2.15 has two potential problems. First: what happens if none of the select signals is asserted? In this case, none of the drivers drives the OUTPUT signal, and that signal floats. Whether it is high or low or somewhere in between is indeterminate, depending upon transient conditions in the drivers and in the parts that are sensing the signal. If the circuit's function depends upon other parts sensing this signal as high or low, then the behavior of the entire circuit may become random.

The usual solution to this sort of problem is to put a part on the circuit that drives it high or low by default. Figure 2.16 shows the same circuit as shown in Figure 2.15, but with an added pullup resistor, a resistor with one end connected to VCC and one end connected to a signal. When none of the select lines is asserted and none of the drivers drives the OUTPUT signal, enough electrical current will flow through the resistor to drive the voltage high. When one of the drivers drives the OUTPUT signal low, current still flows through the resistor, but not enough to raise the voltage enough to matter. As is apparent, you could just as well attach the resistor to ground, and the OUTPUT signal would go low if none of the drivers drive it. In this case the resistor would be called a pulldown resistor.

Figure 2.16 A Circuit With a Pullup

The second problem arises if more than one of the select signals is asserted and therefore more than one of the drivers drive the output signal. Unlike open-collector devices, tri-state devices can and will have bus fights if one of them tries to drive a signal low and another tries simultaneously to drive that signal high. Tri-state devices can overheat and burn up in bus fights just like regular parts. If the software controls the select signals in Figure 2.16, you must ensure that no two of the select lines are ever asserted simultaneously. If hardware controls them, then the hardware engineer must ensure this.
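Both problems, and the software's obligation to keep the select lines exclusive, can be illustrated with a small model of the circuit in Figure 2.16. Everything here is an assumption made for the sketch: the function names are hypothetical, `None` represents the high impedance state, and select lines are modeled as simple booleans regardless of whether the real signals assert high or low.

```python
HIGH, LOW = 1, 0
FLOAT = None  # high impedance: the driver is not driving the signal


def tri_state_driver(select: bool, value: int):
    """Pass the input through when selected; otherwise float the output."""
    return value if select else FLOAT


def one_hot_selects(index, count=3):
    """Build select lines with at most one asserted (index None: none)."""
    return [i == index for i in range(count)]


def bus_output(selects, inputs, pullup=True):
    """Resolve OUTPUT for several tri-state drivers sharing one signal."""
    driven = [tri_state_driver(s, v) for s, v in zip(selects, inputs)]
    levels = {level for level in driven if level is not FLOAT}
    if len(levels) > 1:
        # Two selected drivers disagree: this is the bus fight in the text.
        raise RuntimeError("bus fight: drivers disagree on OUTPUT")
    if levels:
        return levels.pop()
    # No driver is selected: the pullup decides, or the signal floats.
    return HIGH if pullup else FLOAT
```

Generating the select lines through a helper such as `one_hot_selects` is one way software can guarantee that two drivers are never selected at once; passing hand-built select lists such as `[True, True, False]` with conflicting inputs triggers the error path instead.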

Pi gure 2.17 An Overloaded Circuit

Signal Loading

Examine the circuit in Figure 2.17, particularly the OVERLOADED signal, and look for a potential problem.

The problem is that the output signal from the inverter in the lower left corner of the figure is connected to the input signals of an awful lot of other parts. Any kind of part-s-the inverter that drives OVERLOADED as well as dny other--- em drive only a limited amount of electrical current on its various output signals. Each of the inputs attached to OVERLOADED absorbs a certain amount of cur rem in the process of detecting whether the signal is.high or low. If the inputs attached to OVERLOADED absorb more current than the inverter can drive onto OVERLOADED, the circuit won't work. This is the loading problem.

Manufacturers provide data about each part that indicates how much current it em drive out of its outputs and how much current is required for its inputs. I Lrdware must ensure that the outputs driving each signal in the crrcurr can generate enough current to make all of the inputs happy. As a software engineer, you should not have to worry about this problem, but you'll occasionally see peculiar things on the schematics of your systems that will turn out to be solutions to this problem. It is useful to be familiar with the common

solutions so as not to be puzzled them when you encounter them.

One common SOlution to the loading is shown in Figure 2.18. The

added part in that figure is called a driver. Irs output is the S;Ul1e as its input.

Figure 2.18 A Circuit Not Overloaded Anymore

28

IiARDWARE FU:-!DAMbNTALS FOR THE SOFTWARF ENGINEER

~----------------~----~---

2-4 TIMING DIACHAMS

29

Figure 2.19 Another Circuit That's Not Overloaded

----~-

Figure 2.20 A Si'mplc Timing Diagram tor a NAND Gate

Input 1 ----r--\ _

I p--Output

Input 2 -----L __ //

, ,

, ,

---"-~--i-- --\-----------1--1------- --.----~-.------

: I I :

I' I

input 2 ~--.--: _r- .. ----;-. ~-----\ : r=r=>:

'-_____ " ~ ____J

, ,

, ,

,

: _j-·-----------T\__j_J---~-T--·---\_ _

Output ~~-, ,__,

I I I I

Input 1

However, the driver's input uses less current from OVERLOADED than does the sum of the parts to the right of the driver, so it has relieved the load on the inverter. The driver essentially boosts the signal. The only potential difficulty with this solution is that the driver invariably introduces at least a little delay into the signal.

A second possible solution is shown in Figure 2.19.
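The loading check that the hardware engineer performs is simple arithmetic, and a short sketch may make it concrete. The function name and the current figures used in the usage note are hypothetical; real values come from each part's data sheet.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if an output able to drive drive_ua microamps can supply
   every attached input, 0 if the signal is overloaded.
   input_ua[i] is the current absorbed by the i-th attached input. */
int can_drive(unsigned drive_ua, const unsigned *input_ua, size_t n_inputs)
{
    unsigned long total = 0;

    assert(input_ua != NULL || n_inputs == 0);
    for (size_t i = 0; i < n_inputs; i++)
        total += input_ua[i];       /* sum of all the attached loads */
    return total <= drive_ua;
}
```

With three inputs of 400 microamps each, an output rated for 1,000 microamps is overloaded. Placing a driver between them leaves that output with only the driver's single input to supply, which is the relief that Figure 2.18 illustrates.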

2.4 Timing Diagrams

Nothing happens instantaneously in the world of digital circuits, and one of the tools that parts manufacturers use to communicate the characteristics of the parts to engineers is a timing diagram. A timing diagram is a graph that shows the passage of time on the horizontal axis and shows each of the input and output signals changing and the relationship of the changes to one another. Although NAND gates are so simple that manufacturers don't normally publish timing diagrams for them, we'll examine the one in Figure 2.20 to get our feet wet.

In the figure you can see that whenever one of the inputs goes low, the output goes high just a little later. In a real timing diagram from a chip manufacturer, there also would be indications of how much time elapses between when the inputs change and when the output changes. This amount of time is called the propagation delay. For a NAND gate, that time would be just a few nanoseconds, but part of the hardware engineer's job is to make sure that the nanoseconds don't add up to a signal arriving late at its destination.

D Flip-Flops

So far, all of the parts that we have discussed have depended only upon the levels of the input signals, that is, whether those signals are high or low. Other parts depend upon edges in the signals, that is, the transitions of the signals from high to low and vice versa. The transition of a signal from low to high is called a rising edge. The transition of a signal from high to low is called a falling edge.

In Figure 2.21 are two schematic symbols for a D flip-flop, sometimes known as a register, sometimes called a D-flop or a flip-flop, or even a flop. The Q output on the D flip-flop takes on the value of the D input at the time that the CLK input transitions from low to high, that is, at the CLK signal's rising edge. Then the Q output holds that value (no matter what the D input does) until the CLK is driven low again and then high again. The Q/ signal is, as its name implies, the inverse of the Q signal. Some D flip-flops also have a CLEAR/ signal and a PRESET/ signal. On those parts, asserting the CLEAR/ signal forces the Q signal low, no matter what the CLK and D signals are doing; asserting the PRESET/ signal forces the Q signal high.

Figure 2.21 D Flip-Flop

A D flip-flop is essentially a 1-bit memory. Its Q output remembers the state of the D input at the time that the CLK input rises. A similar part, called a latch, also can be used as a memory. A latch is the same as a D flip-flop in that it captures the state of the D input on the rising edge of the CLK input. However, the Q output in a latch is driven to be the same as the D input whenever the CLK input is low, whereas the Q output in a D flip-flop does not change until the rising edge of the CLK input.

Hold Time and Setup Time

D flip-flops have more interesting timing diagrams, because the timing relationship between the two inputs is critical. (See Figure 2.22.) At the rising edge of the CLK signal, the Q output signal will take on the value of the D input signal. However, there is a minimum amount of time, both before and after the rising edge of the CLK signal, during which the D input must remain constant for the D flip-flop to run reliably. The time before the rising edge of CLK during which the D input must remain constant is called the setup time. The time after the rising edge of CLK during which the D input must remain constant is called the hold time. The timing diagram for a D flip-flop indicates the minimum required for these two times (probably just a few nanoseconds). The timing diagram also indicates the maximum amount of time, called the clock-to-Q time, after the rising edge of CLK before the Q output is guaranteed to be valid. Sometimes this amount of time is different, depending upon whether Q is going high or going low. Note that the terms setup time, hold time, and clock-to-Q time are used for all kinds of parts, even for parts with no signal called Q.

Figure 2.22 Timing Diagram for a D Flip-Flop

In the timing diagram in Figure 2.22, the shaded area of the D signal indicates a time period during which it does not matter what the input does. Timing diagrams often use this convention. Note also that Figure 2.22 shows two complete timing cycles, each with a rising edge on CLK: the one on the left in which D is high and Q changes to high, and the one on the right in which D is low and Q changes to low. Each will have a setup time, a hold time, and a clock-to-Q time, but for clarity some of the times are shown on one cycle and some on the other. This is also a common timing diagram convention.

Clocks

Obviously, for a circuit to do anything interesting, the levels on the signals have to change. Some embedded-system products do things only when external events cause a change on one of the inputs, but many circuits need to do things just because time is going by. For example, a microprocessor-based circuit must go on executing instructions even if nothing changes in the outside world. To accomplish this, most circuits have a signal called the clock. The timing diagram for the clock is very simple and is shown in Figure 2.23.

Figure 2.23 A Clock Signal

The purpose of the clock signal is to provide rising and falling edges to make other parts of the circuit do their jobs.

The two types of parts used to generate clock signals are oscillators and crystals. An oscillator is a part that generates a clock signal all by itself. Oscillators typically come in metallic packages with four pins: one for VCC, one for ground, one that outputs the clock signal, and one that is there just to make it easier to solder the oscillator securely onto the printed circuit board. A crystal has just two signal connections, and you must build a little circuit around it to get a clock signal out. Many microprocessors have two pins on them for attachment to a circuit containing a crystal.

You can buy oscillators and crystals in a wide range of frequencies. In picking a frequency, consider first that since other parts in the circuit must react to the clock signal, the clock signal must be slow enough that the other parts' timing requirements are met. For example, when you buy a microprocessor that is the 16-megahertz model, this means that that microprocessor will work with a clock signal that is 16 megahertz (16 million cycles per second), but not with one that is faster. (Note, however, that microprocessors frequently need a crystal that oscillates at some multiple of the actual clock speed that the microprocessor uses.)

The second consideration in picking a frequency for an oscillator or crystal is that it is often desirable to have the clock signal frequency be an integer multiple of the data rate on your network or serial port or other communications medium. It's a lot easier to divide the clock signal by an integer to create another signal at the correct data rate than it is to divide by some fraction. (It's even easier to divide by some power of two, if you can get away with that.)

2.5 Memory

In this section, we'll discuss the memory parts typically found in an embedded-system circuit. Memories of all kinds are sold in a variety of widths, sizes, and speeds. For example, an "8 x 512 KB 70 nanosecond memory" is one that has 512 KB³ storage locations of 8 bits each that can respond to requests for data within 70 nanoseconds. After you decide what kind of memory is useful for your system, you buy the size and speed that you need.
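The size arithmetic behind a designation such as "8 x 512 KB" can be sketched in a few lines. The function names are hypothetical, and the address-line calculation assumes that every storage location gets its own address, which is how the parts described later in this section behave.

```c
#include <assert.h>

/* Number of storage locations in a memory of size_k "K" locations,
   using the memory-size convention that K means 1,024. */
unsigned long locations_from_k(unsigned long size_k)
{
    return size_k * 1024UL;
}

/* Smallest number of address signals that can select any one of
   n_locations distinct locations (limited to 31 in this sketch). */
unsigned address_lines_needed(unsigned long n_locations)
{
    unsigned lines = 0;

    assert(n_locations > 0);
    while (lines < 31 && (1UL << lines) < n_locations)
        lines++;
    return lines;
}
```

For the 512 K part in the example, `locations_from_k(512)` gives 524,288 locations, and `address_lines_needed` reports that 19 address signals are enough to select any one of them.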

Read-Only Memory

Almost every computer system needs a memory area in which to store the instructions of its program. This must be a nonvolatile memory, that is, one that does not forget its data when the power is turned off. In most embedded systems, which do not have a disk drive or other storage medium, the entire program must be in that memory. In a desktop system, enough program must be in memory to start up the processor and read the rest of the program from a disk or a network.

Most computer systems use some variant of Read-Only Memory, or ROM (pronounced "rahm," just like you would expect) for this purpose. The characteristics of ROM are the following:

• The microprocessor can read the program instructions from the ROM quickly, typically as fast as the microprocessor can run the program.

• The microprocessor cannot write new data to the ROM; the data is unchangeable.

• The ROM remembers the data, even if the power is turned off.

3. For memory sizes, KB invariably means 1,024. Therefore, for example, 512 KB means 512 x 1,024, or 524,288.

Figure 2.24 ROM Chip Schematic Symbol

Figure 2.25 Timing Diagram for a Typical ROM

When the power is first turned on, the microprocessor will start fetching the program from the ROM.

Figure 2.24 shows the pins you find on a ROM part. The signals from A0 to An are the address signals, which indicate the address from which the processor wants to read. The number of these depends upon the size of the ROM. (You need more address lines to select a particular address in a larger ROM.) The signals from D0 to Dn are the data signals driven by the ROM. There are typically eight or sixteen of these. The CE/ signal is the chip enable signal, which tells the ROM that the microprocessor wants to activate the ROM. It is sometimes called the chip select signal. The ROM ignores the address signals unless the chip enable signal is asserted. The RE/ signal is the read enable signal, which indicates that the ROM should drive its data on the D0 to Dn signals. The read enable signal is often called output enable, or OE/, instead. Unless both CE/ and RE/ are asserted, the ROM tri-states its output data signals.

Although it may seem redundant, it is normal for ROM parts to have both a chip enable signal and a read enable signal. The purpose for this will become apparent in the next chapter, when we discuss bus architectures. Note that it is very common for these enable signals to be asserted low.

Figure 2.25 shows the timing diagram for a typical ROM. This timing diagram illustrates several other conventions often used in timing diagrams. With parts such as memory chips, which have multiple address or data signals, it is common to show a group of such related signals on a single row of the timing diagram. With such a group of signals, a single line that is neither high nor low indicates that the signals are floating or changing. When signals take on a particular value, that is shown in the timing diagram with two lines, one high and one low, to indicate that each of the signals has been driven either high or low.

The expected sequence of events when a microprocessor reads from a ROM is as follows:

• The microprocessor drives the address lines with the address of the location it wants to fetch from the ROM.

• At about the same time, the chip enable signal is asserted.

• A little while later the microprocessor asserts the read line.

• After a propagation delay, the ROM drives the data onto the data lines for the microprocessor to read.

• When the microprocessor has seen the data on the data lines (an event not shown on this timing diagram; that event would appear on the microprocessor's timing diagram), it releases the chip enable and read enable lines, and the ROM stops driving the data onto the data lines.
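The read sequence above can be traced with a small software model. This is only a sketch: the ROM contents, its eight-location size, and the function names are invented for illustration, and the model captures the order of events and the tri-state behavior but none of the timing.

```c
#include <assert.h>
#include <stdint.h>

#define ROM_SIZE     8u      /* hypothetical tiny ROM: 3 address lines  */
#define BUS_FLOATING (-1)    /* stands in for tri-stated data outputs   */

static const uint8_t rom_contents[ROM_SIZE] = {
    0x10, 0x21, 0x32, 0x43, 0x54, 0x65, 0x76, 0x87
};

/* Model of the ROM's data outputs. ce_n and re_n are the CE/ and RE/
   signals: 0 means asserted (low), 1 means not asserted (high).
   Returns the byte driven onto the data lines, or BUS_FLOATING when
   the ROM tri-states its outputs. */
int rom_data_out(unsigned address, int ce_n, int re_n)
{
    if (ce_n != 0 || re_n != 0)
        return BUS_FLOATING;        /* not both asserted: outputs float */
    return rom_contents[address % ROM_SIZE];
}

/* One read cycle as seen from the microprocessor. */
int cpu_read(unsigned address)
{
    int ce_n = 1, re_n = 1;         /* both enable lines released */
    int data;

    ce_n = 0;                       /* address driven, CE/ asserted */
    assert(rom_data_out(address, ce_n, re_n) == BUS_FLOATING);
    re_n = 0;                       /* a little later, RE/ asserted */
    data = rom_data_out(address, ce_n, re_n);   /* ROM drives the data */
    ce_n = 1;                       /* release both lines again */
    re_n = 1;
    assert(rom_data_out(address, ce_n, re_n) == BUS_FLOATING);
    return data;
}
```

Calling `cpu_read(3)` returns 0x43 in this model, and the two assertions inside `cpu_read` confirm that the ROM leaves the data lines floating whenever either enable signal is not asserted.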

Most ROM chips also can handle a cycle in which the read enable line is asserted first and the chip enable line is asserted second, but they often respond much more slowly in this situation. The typical critical times for a ROM chip are the following:

• How long is it between the time when the address is valid and the chip enable signal is asserted and the time when the data signals driven by the ROM are valid?

• How long is it between the time when read enable is asserted and the time when the data signals driven by the ROM are valid?

ROM Variants

All sorts of ROMs are available. The data in a first kind of ROM is written into it at the semiconductor factory when that ROM is built; it can never be changed. Some people use the term masked ROM for this sort of ROM; others just call it ROM.

The next kind is Programmable Read-Only Memory, or PROM. PROMs are shipped blank from the factory, and you can write a program into them in your office with a PROM programmer or PROM burner, a tool made for that purpose. It takes a matter of seconds to write a program into a PROM, but you can write into a PROM only once. If a program in a PROM has a mistake, you throw the PROM away, fix the program, and write the new program into a new PROM. PROM programmers are relatively inexpensive, selling for as little as $100.

The next variant is Erasable Programmable Read-Only Memory, or EPROM ("ee-prahm"). EPROMs are like PROMs, except that you can erase them and reuse them. The usual way to erase an EPROM is to shine a strong ultraviolet light into a window on the top of the chip. EPROM erasers, boxes with ultraviolet lights in them, are also widely available and inexpensive. The only tricky thing about an EPROM eraser is that it must be designed to keep you from looking into the ultraviolet light by mistake and damaging your eyes. It usually takes an EPROM eraser 10 to 20 minutes to erase an EPROM.

The next variant on ROM is flash memory, sometimes called flash. Flash memories are similar to PROMs, except that they can be erased and rewritten by presenting certain signals to their input pins. Therefore, the microprocessor itself can change the program in the flash. However, there are a few limitations of flash memory that you should know about:

• You can write new data into flash memory only a certain number of times before it wears out, typically on the order of 10,000 times.⁴

• In most flash memories you have to write a whole block of data, say 256 bytes or maybe even 4K bytes, at one time. There is no way to write just 1 byte or 4 bytes.

• The writing process is very slow (unlike the reading process, which is fast), taking on the order of several milliseconds to write a new block of data into the flash.

• The microprocessor usually can't fetch instructions from flash during the several milliseconds that it takes to write new data into the flash, even if the part of the flash that is changing does not include the program. Therefore, the flash-programming program itself has to be stored somewhere else, at least when it is actually running.

For these reasons, the most typical use of flash memory is to store a program or rarely changed configuration data such as an IP address or the date on which the product should next be serviced and the diagnostic programs run.
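Because flash can be written only a block at a time, changing a single configuration byte means copying the enclosing block to RAM, patching the copy, and writing the whole block back. The sketch below simulates this with an array in RAM. The block size, the wear limit, and every function name are assumptions made for illustration, and a real part would also need whatever erase and write commands its data sheet specifies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 256u      /* hypothetical block size              */
#define NUM_BLOCKS 4u        /* hypothetical number of blocks        */
#define WEAR_LIMIT 10000u    /* order of magnitude given in the text */

static uint8_t  flash[NUM_BLOCKS][BLOCK_SIZE];   /* simulated part */
static unsigned write_count[NUM_BLOCKS];         /* writes so far  */

/* Writes one whole block, the only kind of write the simulated part
   supports. Returns 0 on success, -1 if the block is worn out. */
int flash_write_block(unsigned block, const uint8_t *data)
{
    assert(block < NUM_BLOCKS);
    if (write_count[block] >= WEAR_LIMIT)
        return -1;
    memcpy(flash[block], data, BLOCK_SIZE);
    write_count[block]++;
    return 0;
}

/* Changes one byte by copying its block to RAM, patching the copy,
   and writing the whole block back. */
int flash_update_byte(unsigned address, uint8_t value)
{
    uint8_t  ram_copy[BLOCK_SIZE];
    unsigned block  = address / BLOCK_SIZE;
    unsigned offset = address % BLOCK_SIZE;

    assert(block < NUM_BLOCKS);
    memcpy(ram_copy, flash[block], BLOCK_SIZE);
    if (ram_copy[offset] == value)
        return 0;                   /* unchanged: don't spend a write */
    ram_copy[offset] = value;
    return flash_write_block(block, ram_copy);
}

uint8_t flash_read_byte(unsigned address)
{
    assert(address < NUM_BLOCKS * BLOCK_SIZE);
    return flash[address / BLOCK_SIZE][address % BLOCK_SIZE];
}
```

Skipping the write when the value has not changed is one simple way to stretch the limited number of write cycles, which matters for data that is saved on every power-up or every configuration change.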

The next variant is Electrically Erasable Read-Only Memory, or EEROM ("ee-ee-rahm" or "double-ee rahm"), sometimes called EEPROM (the P in the middle standing for "programmable," as you might guess). EEROM is very similar to flash memory, except that:

• Both the writing process and the reading process are very slow in an EEROM. In fact, some EEROMs require that you write a little software routine to get data into and out of them one bit at a time.

• EEROMs often store only a very little data, often less than 1 K or so.

• You can write new data into an EEROM only a certain number of times before it wears out, but that number is often on the order of millions of times, so in many applications the limit doesn't matter.

Because of these characteristics, EEROM is useless for storing a program. It is used instead to store configuration information that might change relatively frequently but that the system should recover on power-up; for example, a network address, data rates, user names, number of pages printed, miles driven, etc.

4. All of the quantitative characteristics mentioned in this book about memory parts were current when the book was written. However, as this is an area of rapid development and evolution, you should assume that they may have changed by now.
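A routine that moves data "one bit at a time" usually amounts to setting a data pin and pulsing a clock pin once per bit. The sketch below shows the idea against a simulated shift register. The pin functions, the bit order (most significant bit first), and the one-pulse-per-bit protocol are all assumptions made for illustration; a real EEROM defines its own protocol in its data sheet.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated serial part: a shift register that captures the level
   of the data pin on every clock pulse. */
static uint8_t sim_shift_reg;
static int     sim_data_pin;

static void set_data_pin(int level)
{
    sim_data_pin = level ? 1 : 0;
}

static void pulse_clock(void)
{
    sim_shift_reg = (uint8_t)((sim_shift_reg << 1) | sim_data_pin);
}

/* Sends one byte, most significant bit first, one bit per clock. */
void eerom_shift_out(uint8_t value)
{
    for (int bit = 7; bit >= 0; bit--) {
        set_data_pin((value >> bit) & 1);
        pulse_clock();
    }
    /* Holds for this simulation only: after eight pulses the
       shift register contains exactly the byte that was sent. */
    assert(sim_shift_reg == value);
}
```

Eight pin writes and eight clock pulses per byte, each done in software, give a sense of why both reading and writing are slow with parts of this kind.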

See Table 2.1 for a comparison of the various kinds of memory.

Table 2.1 Types of Memory

Technology          Read Speed   Write Speed   Write Times
ROM (masked ROM)    Fast         N/A           0
PROM                Fast         N/A           1
EPROM               Fast         N/A           Many
Flash               Fast         Slow          10,000
EEROM               Slow         Slow          1,000,000
RAM                 Very fast    Very fast     Infinite

Comments

ROM (masked ROM): ROM is useful for programs. It is programmed at the semiconductor factory. After an initial setup charge, ROMs are the least expensive type of permanent memory, and they are thus the best choice for a product with large volumes. In general, although they are not quite as fast as RAMs, ROMs are still fast enough to allow most microprocessors to execute programs directly from them.

PROM: PROM also is useful for programs. It is shipped from the factory blank, and you use a PROM programmer to program it. PROM is useful for products with lower volumes, since there is no setup charge, but it is more expensive than ROM.

EPROM: EPROM is also shipped from the factory blank and is programmed with a PROM programmer. It can be erased by shining a strong ultraviolet light on it for 10 or 20 minutes and then reused; it is therefore useful when you are debugging a program.

Flash: Flash is useful for storing programs. The principal advantage of flash over the various other kinds of program memory is that it can be written to even after the product is shipped; for example, to upgrade to a new software version. Since it cannot be written to quickly, however, it is unsuitable for rapidly changing data. You can store data in flash, but you cannot change that data very often.

EEROM: EEROM is useful for storing data that must be remembered when the power goes off. Since both reading from and writing to EEROMs are slow processes, EEROMs are not suitable for programs or for working data.

RAM: RAM is useful for data. Also, some very fast microprocessors would be slowed down if they executed the program directly from any flavor of ROM; in these cases, it is sometimes useful to copy the program from ROM to RAM at power-up time.

Random Access Memory

Every computer system needs a memory area in which to store the data on which it is working. This memory area is almost invariably made up of Random Access Memory, or RAM ("ram"). The general characteristics of RAM are listed below:

• The microprocessor can read the data from the RAM quickly, faster even than from ROM.

• The microprocessor can write new data to the RAM quickly, erasing the old data in the RAM as it does so.

• RAM forgets its data if the power is turned off.

The RAM is not a good place for a bootstrap program, because it would be forgotten on power failure. However, RAM is the only possible place to store data that needs to be read and written quickly.

Computer systems use two types of RAM: static RAM and dynamic RAM. Static RAM remembers its data without any assistance from other parts of the circuit. Dynamic RAM, on the other hand, depends on being read once in a while; otherwise, it forgets its data. To solve this problem, systems employing dynamic RAM use a circuit (often built into the microprocessor) called dynamic RAM refresh, whose sole purpose is to read data from the dynamic RAM periodically to make sure that the data stays valid. This may seem like a lot of complication that you can avoid by using static RAM instead of dynamic RAM, but the payoff is that dynamic RAM is comparatively cheap.

Static RAM parts look much like ROM parts, except that they have a write enable signal in addition to the other signals, which tells the RAM when it should store new data. Dynamic RAM is more complex and is quite different; a discussion of how circuits containing dynamic RAM must be built is beyond the scope of this book.

Chapter Summary

• Most semiconductor parts, chips, are sold in plastic or ceramic packages. They are connected to one another by being soldered to printed circuit boards.

• Electrical engineers draw schematic diagrams to indicate what parts are needed in each circuit and how they are to be connected to one another. Names are often assigned to signals on schematics.

• Digital signals are always in one of two states: high and low. A signal is said to be asserted when the condition that it signals is true. Some signals are asserted when they are high; others, when they are low.

• Each chip has a collection of pins that are inputs, and a collection that are outputs. In most cases, each signal must be driven by exactly one output, although it can be connected to multiple inputs.

• The standard semiconductor gates perform Boolean NOT, AND, OR, and XOR functions on their inputs.

• In addition to their input and output pins, most chips have a pin to be connected to VCC and a pin to be connected to ground. These pins provide power to run the chip.

• Decoupling capacitors prevent local brownouts in a circuit.

• A signal that no output is driving is a floating signal.

• Open collector devices can drive their outputs low or let them float, but they cannot drive them high. You can connect multiple open collector outputs to the same signal; that signal will be low if any output is driving low.

• Tri-state devices can drive their outputs high or low or let them float. You can connect multiple tri-state outputs to the same signal, but you must ensure that only one of the outputs is driving the signal at any one time and that the rest are letting the signal float.

• A dot on a schematic indicates that crossing lines represent signals that are to be connected to one another.

• A single output can drive only a limited number of inputs. Too many inputs leads to an overloaded signal.

• Timing diagrams show the timing relationship among events in a circuit.

• The various important timings for most chips are the hold time, the setup time, and the clock-to-Q time.

• D flip-flops are 1-bit memory devices.

• The most common types of memory are RAM, ROM, PROM, EPROM, EEROM, and flash. Since they each have unique characteristics, you will use them for different things.

Problems

1. In what kind of memory would you store each of the following?

• The program for an intelligent VCR of which your company hopes to sell 10 million units.

• A user-configurable name for a printer attached to a network that the printer should remember even if the power fails.

• The program for a beta version of an x-ray machine that your company is about to ship to several hospitals on an experimental basis.

• The data that your program just received from the network.

Figure 2.26 Circuit for Question 3

Figure 2.27 Circuit for Question 5

2. Write out the truth table for a three-input AND gate.

3. What does the circuit in Figure 2.26 do?

4. You can buy a three-input NAND gate, but nobody makes a three-input NAND gate such as the one shown in Figure 2.10, in which some of the inputs are negated. How would you expect to see that circuit really appear on a schematic?

5. What does the circuit in Figure 2.27 do?

6. What does the circuit in Figure 2.28 do? Why would anyone do this?

7. Examine the circuit in Figure 2.29. The idea is that the circuitry on the left-hand side is always running, but the circuitry on the right-hand side gets turned on and off from time to time to save power. The capacitor shown in the middle of the diagram is intended to cushion the voltage when the switch is closed. What is wrong with this design? What will the symptoms most likely be? How should it be fixed?

8. Why does the circuit in Figure 2.19 solve the loading problem? How does the circuit in Figure 2.19 compare to the circuit in Figure 2.18?

9. What does the timing diagram for a static RAM look like? Remember to include both a read cycle and a write cycle.

Figure 2.28 Circuit for Question 6

Figure 2.29 Circuit for Question 7

Advanced Hardware Fundamentals

This chapter is a continuation of the previous one. We'll discuss the various parts you will commonly find in an embedded-system circuit.

3.1 Microprocessors

Microprocessors come in all varieties from the very simple to the very complex, but in the fundamental operations that they perform, they are very similar to one another. In this section, we will discuss a very basic microprocessor, so basic that no one makes one quite this simple. However, it shares characteristics with every other microprocessor. It has the following signals, as shown in Figure 3.1:

• A collection of address signals it uses to tell the various other parts of the circuit (memory, for example) the addresses it wants to read from or write to.

• A collection of data signals it uses to get data from and send data to other parts in the circuit.

• A READ/ line, which it pulses or strobes low when it wants to get data, and a WRITE/ line, which it pulses low when it wants to write data out.

• A clock signal input, which paces all of the work that the microprocessor does and, as a consequence, paces the work in the rest of the system. Some microprocessors have two clock inputs to allow the designer to attach the crystal circuits discussed in Chapter 2 to them.

Figure 3.1 A Very Basic Microprocessor

These signals should look familiar from our discussions of memory chips in Chapter 2.

Most microprocessors have many more signals than this, and we'll discuss some of them later in the chapter. However, the above collection is all that the microprocessor needs in order to fetch and execute instructions and to save and retrieve data.

Some people use the term microcontroller for the very small end of the range of available microprocessors. Although there is no generally accepted definition for microcontroller, most people use it to mean a small, slow microprocessor with some RAM and some ROM built in, limited or no capability for using RAM and ROM other than what is built in, and a collection of pins that can be set high or low or sensed directly by the software. Since the principles of programming a microcontroller are the same as those for programming a microprocessor, and since manufacturers build these parts with every combination of capabilities imaginable, in this book we will use the term microprocessor to mean both.

3.2 Buses

Figure 3.2 shows a very basic system built from a microprocessor, a ROM, and a RAM. There are eight data signals, D0 through D7. The microprocessor can address 64 K of memory and thus has 16 address lines, A0 through A15. The ROM and the RAM each have 32 K and thus have 15 address lines each, A0 through A14. The address and data signals on the microprocessor are connected to the address and data signals on the ROM and the RAM. The READ/ signal from the microprocessor is connected to the output enable (OE/) signal on the memory chips. The write signal for the microprocessor is connected to the write enable (WE/) signal on the RAM. Some kind of clock circuit is attached to the clock signals on the microprocessor.

The address signals as a group are very often referred to as the address bus. Similarly, the data signals are often referred to as the data bus. The combination of the two, along with the READ and WRITE signals from the processor, is referred to as the microprocessor bus, or as the bus. The schematic in Figure 3.2 follows a common convention in which all of the signals that are part of a bus are drawn as a single heavy line rather than as a collection of the 8 or 16 (or more) lines. Individual signals from the bus branch off of the heavy line and are labeled wherever they connect to some part in the circuit.

Figure 3.2 A Very Basic Microprocessor System

How does this circuit deal with the fact that the microprocessor might want to read either from RAM or from ROM? From the microprocessor's point of view, there are no ROM and RAM chips. It just has a 64 K address space, and when it drives an address on the address bus to represent one of the addresses in this address space, it expects circuitry out there somewhere to provide data on the data bus. To make sure that the microprocessor can read from either, you must divide up the address space, assigning some of it to the ROM and some to the RAM. Then you must build circuitry to implement your division. Since the ROM and the RAM each have 32 K, one possible division is shown in Table 3.1.

Table 3.1 A Possible Division of the Address Space

        Low Address                          High Address
ROM     0x0000  (binary: 0000000000000000)   0x7fff  (binary: 0111111111111111)
RAM     0x8000  (binary: 1000000000000000)   0xffff  (binary: 1111111111111111)

You can do the arithmetic and see that both of the ranges in Table 3.1 are 32 K. To use these address ranges, you must build a circuit that activates the ROM chip when an address in its range appears on the bus and that activates the RAM chip when an address in its range appears on the bus. In this particular case this is simple. Notice that in all of the addresses that correspond to ROM, the highest-order address signal (A15) is 0, whereas in all of the addresses that correspond to RAM, A15 is 1. Therefore, you can use the A15 signal to decide which of the two chips, ROM or RAM, should be activated. In Figure 3.2 you can see that A15 is attached to the chip enable (CE/) signal on the ROM, enabling it whenever A15 is 0. The A15 signal is inverted and then attached to the chip enable signal on the RAM, enabling the RAM whenever A15 is 1.

As an example, consider what happens if the microprocessor tries to read from address 0x9123. The A15 signal will be a 1 (because 0x9123 is 1001000100100011 in binary), which means that the chip enable signal on the ROM will be high, and the ROM will therefore be disabled. But because the A15 signal is high, the output of the inverter at the bottom of Figure 3.2 will be low, enabling the RAM. The RAM will place the data from its cell number 0x1123 on the bus: 0x1123 (not 0x9123) because the A15 signal is not part of the address bus that goes to the RAM; the RAM sees only 001000100100011. See Figure 3.3.

Figure 3.3 Another Look at the Address Space

Microprocessor addresses     Chip addresses
0x8000 to 0xffff             RAM addresses 0x0000 to 0x7fff
0x0000 to 0x7fff             ROM addresses 0x0000 to 0x7fff
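The decoding that the A15 signal and the inverter perform can be written out as a few lines of C. This is a sketch of the logic only, with hypothetical type and function names; in the circuit itself the work is done by wiring, not by software.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CHIP_ROM, CHIP_RAM } chip_t;

typedef struct {
    chip_t   chip;      /* which chip's enable signal is asserted */
    uint16_t offset;    /* address the chip sees on A0 to A14     */
} decoded_t;

/* Splits a 16-bit microprocessor address the way the circuit does:
   A15 selects the chip, and A0 to A14 go to the selected chip. */
decoded_t decode_address(uint16_t cpu_address)
{
    decoded_t d;
    unsigned  a15 = (cpu_address >> 15) & 1u;  /* highest-order line */

    d.chip   = a15 ? CHIP_RAM : CHIP_ROM;
    d.offset = (uint16_t)(cpu_address & 0x7fffu);
    assert(d.offset <= 0x7fffu);    /* always fits in 15 address lines */
    return d;
}
```

For the example address, `decode_address(0x9123)` selects the RAM with offset 0x1123, matching the cell number worked out in the text.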

ADVANCI'lJ

50

FVNDAME.:sTAtS

51

Additional Devices

the Bus

not used by any of the rnemory

Some microprocessors allow an alternative mechanism because they support two address spa(cs: the memory address space, which we have already discussed, and an I/O address space, A microprocessor that supports an I/O address space has one or two additional pms with which. it signals whether it is reading or writing in the memory address space or in the I/O address space, Different microprocessors signal this in different ways; perhaps the most common is a single pin that the microprocessor doves low for the memory

space and high for the I10 address space.

Microprocessors that support an I/O address space have extra assembly language instructions for that. The I\10VE instruction reads from or ~rites to memory; instructions such as "IN" and "OUT" access devices III the flO address space, The libraries of the C compilers for these microprocessors typically contain functions to read to and write from devices in the I/O address space, with names such as i nport, out por t, i np, outp, i nbyte, i nword,jnpw, and so 011. The code fragment shown here illustrates typical use of thes~ functions.

In addition to

the ROM, and the RAM, most embedded

systems underground tank monitoring system must have hardware to capture the float levels: the cordless bar-code scanner must have some device to send devices must be ,_valle""

to able co

the microprocessor

used to connect the and the memory; the address and data

that make up the bus connect to the additional devices as well.

hardware the network chip bus.

    #define NETWORK_CHIP_STATUS   (0x80000)
    #define NETWORK_CHIP_CONTROL  (0x80001)

    void vFunction ()
    {
        BYTE byStatus;

        /* Read the status from the network chip. */
        byStatus = inp (NETWORK_CHIP_STATUS);

        /* Write a control byte to the network chip. */
        outp (NETWORK_CHIP_CONTROL, 0x23);
    }

microprocessor. Here is a sample of code to use a memory-mapped device.

    #define NETWORK_CHIP_STATUS ((BYTE *) 0x80000)

    void vFunction ()
    {
        BYTE byStatus;
        BYTE *p_byHardware;

        /* Set up a pointer to the network chip. */
        p_byHardware = NETWORK_CHIP_STATUS;

        /* Read the status from the network chip. */
        byStatus = *p_byHardware;
    }
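When code like the memory-mapped fragment above is built with an optimizing compiler, it is usually safer to declare the pointer volatile, so that the compiler does not remove or reorder accesses that look redundant to it. The following is a minimal sketch of that idea; the helper names read_register and write_register are invented for illustration, and an ordinary variable stands in for the network chip so that the code can run anywhere.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t BYTE;

/* Read one byte from a memory-mapped register. The volatile qualifier
   tells the compiler that each read must actually be performed. */
BYTE read_register(const volatile BYTE *p_byHardware)
{
    return *p_byHardware;
}

/* Write one byte to a memory-mapped register. */
void write_register(volatile BYTE *p_byHardware, BYTE byValue)
{
    *p_byHardware = byValue;
}

/* Self-check with an ordinary variable standing in for the chip. */
void check_register_access(void)
{
    BYTE byFake = 0x5A;
    assert(read_register(&byFake) == 0x5A);
    write_register(&byFake, 0x23);
    assert(byFake == 0x23);
}
```

On real hardware the argument would be a fixed address cast to a pointer, such as (volatile BYTE *) 0x80000 for the status register in the text.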

Figure 3.4 is an example of a system with one device in the I/O address space (DV1) and another device in the memory address space (DV2). The hypothetical microprocessor in the system sets the I/O signal high to read or write in the I/O address space and low to read or write in the memory address space. The gate in the upper right-hand corner of the schematic that drives the memory enable signal (MEMEN/) asserts that signal low when the I/O signal and A19 are both low. This enables the memory in the memory address space in the range from 0x00000 to 0x7ffff. The gate below DV1 asserts the chip enable signal to DV1 when A19 and I/O are both high. Since DV1 has eight address signals, it appears in the I/O address space in the range from 0x80000 to 0x800ff. The gate below DV2 asserts the chip enable signal to DV2 when A19 is high and I/O is low. Since DV2 has three address signals, it appears in the memory address space in the range from 0x80000 to 0x80007.

(Note that since this circuit asserts DV1's chip enable signal whenever A19 and I/O are high and does not check A18 through A8, the circuit can read from or write to DV1 no matter what the values of those address signals are. Effectively, whatever the circuit reads from 0x80000, it can also read from 0x80100, 0x80200, 0x8fe00, and so on. Similarly, the same data appears at multiple addresses in device DV2. This sort of thing is common in embedded systems.)

Figure 3.4 Memory Mapping and the I/O Address Space
[Schematic: the microprocessor, the memory, DV1, and DV2, with the gates that drive MEMEN/ and the two chip enable signals. Annotation: "These signals go to the ROM and RAM, as before."]

Bus Handshaking

In addition to the logic problems of hooking up the address and data busses correctly in Figure 3.2, another issue that must be resolved is the problem of timing. As we discussed in the last chapter, the ROM and the RAM will have various timing requirements: the address lines must stay stable for a certain period of time, and the read enable and chip enable lines must be asserted for some period of time; only then will the data be valid on the bus. The microprocessor is in control of all of these signals, and it decides when to look for data on the bus. This entire process is called a bus cycle. For the circuit to work, the signals that the microprocessor produces must conform to the requirements of the other parts in the circuit. The various mechanisms by which this can be accomplished are referred to collectively as bus handshaking. Several of these mechanisms are discussed below. One of them requires the active cooperation of the software.

No Handshake

If there is no bus handshaking, then the microprocessor just drives the signals at whatever speed suits it, and it is up to the other parts of the circuit to keep up. In this scenario, the hardware engineer must select parts for the circuit that can keep up with the microprocessor (or, conversely, buy a microprocessor that is slow enough that it won't get ahead of the other parts). As we discussed in Chapter 2, you can buy ROMs and RAMs that run at various speeds. For example, you can purchase ROMs that respond in 120, 90, or 70 nanoseconds, depending on how fast they must be to keep up with your microprocessor (and on how much you're willing to pay).

Wait Signals

'w.nrpc.cn •• ·•· ofh:,:t a second rnemorv can use to (,;xfc'ncI the bus

3.5. the top .baIf of the

85

however,

it can assert the WAIT signal to make the microprocessor extend the bus cycle. As long as the WAIT signal is asserted, the microprocessor will wait indefinitely for the device to put the data on the bus. This is illustrated in the lower half of the figure.

The only disadvantage of using a WAIT signal is that ROMs and RAMs don't come from the manufacturer with a wait signal, so someone has to build the circuitry to drive the wait signal correctly, and this can take up engineering time to design and cost money to build.

Figure 3.5 Bus

Wait States (and Performance)


Some microprocessors offer a third alternative for dealing with slower memory devices: wait states. To understand wait states, you need first to understand how the microprocessor times the signals on the bus in the first place. The microprocessor has a clock input, as we've mentioned, and it uses this clock to time all of its activities, in particular its interaction with the bus. Examine Figure 3.6.

Figure 3.6 The Microprocessor Clock Times the Bus
[Timing diagrams. Figure 3.5 shows A0-An, D0-Dn, READ/, and WAIT for a normal bus cycle and for a cycle extended by asserting the WAIT signal. Figure 3.6 shows Clock, A0-An, D0-Dn, and READ/, with these annotations: the microprocessor drives the address bus to start the bus cycle; the microprocessor drives READ/ low; the microprocessor reads the data from the bus; end of the bus cycle; start of the next bus cycle.]

Each of the signal changes during the bus cycle happens at a certain time in relation to the microprocessor's input clock signal. The clock cycles in a single bus cycle are typically labeled T1, T2, T3, etc. The microprocessor shown in this figure behaves as follows (This is essentially the timing of a Zilog Z80.):

• It outputs the address on the rising edge of T1; that is, when the clock signal transitions from low to high in the first clock cycle of the bus cycle.

• It asserts the READ/ line at the falling edge of T1.

• It expects the data to be valid and actually takes the data in just a little after the rising edge of T3 (shown by the third vertical line in the figure).

• It de-asserts the READ/ line at the falling edge of T3 and shortly thereafter stops driving the address signals, thereby completing the transaction. The next clock cycle would be T1 of the following bus cycle, and if the microprocessor is ready, it will drive another address onto the address bus to start another bus cycle.

If this microprocessor is capable of using wait states, then it will be able to insert extra clock cycles, typically between cycles T2 and T3. See Figure 3.7. The beginning of the bus cycle is the same as before, with the microprocessor driving the address signals and the READ/ signal at the start of the cycle. However, the microprocessor then waits one extra clock cycle before reading the data and completing the cycle. A piece of circuitry inside the microprocessor called a wait state generator is responsible for this behavior.

Most wait state generators allow software to tell them how many wait states to insert into each bus cycle, up to some maximum, perhaps three or perhaps fifteen. Most microprocessors also allow you to use different numbers of wait states for different parts of the address space. This latter is useful because some devices are much faster than others: RAM, for example, is typically faster than ROM; I/O devices tend to be slow.

The typical microprocessor inserts the maximum number of wait states into every bus cycle when it is first powered up or is reset. This means that the hardware engineer can use a slow ROM if he or she wants to save some money. It also means that the processor will start off very slowly, even if the hardware engineer decides to pay for a fast ROM. The fewer wait states your system uses, the faster it will run. It is up to software engineers to find out from the hardware engineers how few wait states they can get away with, and then write code to set up the wait state generator accordingly.
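As a rough illustration of how that number might be estimated, the sketch below assumes the timing of Figure 3.6: the address goes out at the start of T1 and the data is taken at the start of T3, so a device has about two clock periods to respond, and each wait state adds one more period. It ignores setup times and the delays of any glue circuitry, so the hardware engineer's figure always takes precedence. The function name wait_states_needed is invented for this example.

```c
#include <assert.h>

/* Estimate the wait states needed for a device with the given access
   time, assuming two clock periods are available without wait states
   and each wait state adds exactly one clock period. */
unsigned wait_states_needed(unsigned access_ns, unsigned clock_ns)
{
    assert(clock_ns > 0);

    /* Clock periods needed to cover the access time, rounded up. */
    unsigned cycles = (access_ns + clock_ns - 1) / clock_ns;

    return cycles > 2 ? cycles - 2 : 0;
}
```

With a 50-nanosecond clock, for example, a 70-nanosecond ROM fits in the normal cycle, while a 120-nanosecond ROM needs one wait state under these assumptions.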

Figure 3.7 The Microprocessor Adds a Wait State
[Timing diagram: Clock (cycles labeled, including T2 and TW), A0-An, D0-Dn, and READ/, with these annotations: the microprocessor drives the address bus to start the bus cycle; the microprocessor drives READ/ low; the microprocessor reads the data from the bus; end of the bus cycle.]

3.3

Direct Memory Access

One way to get data into and out of systems quickly is to use direct memory access or DMA. DMA is circuitry that can read data from an I/O device, such as a serial port or a network, and then write it into memory or read from memory and write to an I/O device, all without software assistance and the associated overhead. However, DMA creates some new problems for hardware designers to resolve. The first difficulty is that the memory only has one set of address and data signals, and DMA must make sure that it is not trying to drive those signals at the same time as the microprocessor is trying to drive them.

This is usually solved in a manner similar to that shown in Figure 3.8. In all of the discussion that follows, we will discuss transferring data from the I/O device to the RAM; the process for moving data in the other direction is similar.

When the I/O device has data to be moved into the RAM, it asserts the DMAREQ signal to the DMA circuit. The DMA circuit in turn asserts the BUSREQ signal to the microprocessor. When the microprocessor is ready to give up the bus (which may mean not executing instructions for the short period during which the DMA does its work) it asserts the BUSACK signal. The DMA circuitry then places the address into which the data is to be written on the address bus, asserts DMAACK back to the I/O device, and asserts WRITE/ to the RAM. The I/O device puts the data on the data bus for the RAM, completing the write cycle.

Figure 3.8 Architecture of a System with DMA
[Block diagram: the microprocessor, the RAM, the DMA circuit, and the I/O device share the address bus, READ/ and WRITE/, and the data bus. BUSREQ and BUSACK run between the DMA circuit and the microprocessor; DMAREQ and DMAACK run between the I/O device and the DMA circuit.]

Figure 3.9 DMA
[Timing diagram: DMAREQ, DMAACK, BUSREQ, BUSACK, WRITE/, Address (driven by DMA), and Data (driven by I/O device).]

After the data has been written, the DMA circuitry releases DMAACK, tri-states the address bus, and releases BUSREQ. The microprocessor releases BUSACK and continues executing instructions. A timing diagram for this is shown in Figure 3.9. Note that Figure 3.9 includes two new timing diagram conventions. First, the cross-hatching in the address and data buses indicates that the circuit we are discussing does not care what the values are and that those buses may be driven by other components during that time. When the cross-hatching ends, it indicates that the other circuits should stop driving those signals. Second, the arrows indicate which edges cause which subsequent edges.

Obviously, the DMA circuitry has to conform to all of the timings required by the I/O device and by the RAM.

One question that must be dealt with when you are building a circuit with DMA is: how does the DMA know when it should transfer a second byte? In other words, after it has finished one transfer, what will cause the DMA to decide that there is another byte to transfer? There are two answers:

• The DMA can be edge triggered, meaning that it will transfer a byte whenever it sees a rising edge on DMAREQ (assuming that DMAREQ is asserted high). In this case, the I/O device requesting the data transfer must lower DMAREQ after each byte and then raise it again, possibly immediately, when it has another byte.

• The DMA can be level triggered, meaning that it will transfer bytes as long as DMAREQ remains high. In this case, the I/O device can hold DMAREQ high as long as there are more bytes to transfer, but it must lower DMAREQ quickly when the last byte is transferred.
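The difference between the two schemes can be seen in a small simulation. This sketch is only a model, not real DMA hardware: it assumes DMAREQ is sampled once per transfer slot, that it is asserted high, and that it was low before the first sample. The function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Edge triggered: one transfer per low-to-high transition of DMAREQ. */
size_t count_edge_triggered(const bool *dmareq, size_t n)
{
    assert(dmareq != NULL || n == 0);

    size_t transfers = 0;
    bool previous = false;      /* DMAREQ assumed low before sampling starts */
    for (size_t i = 0; i < n; i++) {
        if (dmareq[i] && !previous)
            transfers++;
        previous = dmareq[i];
    }
    return transfers;
}

/* Level triggered: one transfer in every slot in which DMAREQ is high. */
size_t count_level_triggered(const bool *dmareq, size_t n)
{
    assert(dmareq != NULL || n == 0);

    size_t transfers = 0;
    for (size_t i = 0; i < n; i++) {
        if (dmareq[i])
            transfers++;
    }
    return transfers;
}
```

For the same DMAREQ waveform, the edge-triggered count depends on how often the device lowers and raises the signal, while the level-triggered count depends on how long the device holds it high, which is why the device must lower it promptly after the last byte.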

An alternative way to make DMA work is shown in Figure 3.10.

The interaction with the DMAREQ, BUSREQ, and BUSACK signals is the same as before. Once the DMA circuitry has the bus, however, it performs a simple read from the I/O device and captures the data in a register somewhere within the DMA itself. Then the DMA circuitry performs a write to the RAM.

Figure 3.10 Alternative DMA Architecture
[Block diagram: the CPU, the RAM, the DMA circuit, and the I/O device share the address bus, READ/ and WRITE/, and the data bus. Signals shown include BUSREQ, BUSACK, and DMAREQ.]

A timing diagram for this is shown in Figure 3.11.

The advantage of this second architecture over the one shown in Figure 3.8 is that it puts less burden on the I/O device circuitry. The I/O device needs only to be able to assert DMAREQ at appropriate times, because the fulfillment of the request looks like a regular read from the perspective of the I/O device. On the other hand:

• The DMA circuit is quite a bit more complicated in that it has to be able to store the data.

• It takes about twice as much bus time to transfer the data, since it has to be transferred first to the DMA and then on to the memory.

If several I/O devices need to use DMA simultaneously to move data, your system will need a copy of the DMA circuitry, called a DMA channel, for each one. Some I/O devices come with DMA channels built into them. I/O devices that can move large amounts of data quickly, such as network controllers, are particularly likely to have a DMA channel built in.

Figure 3.11 Alternate DMA Timing
[Timing diagram: DMAREQ, BUSREQ, WRITE/, Data, and Address. First the DMA drives the I/O device address on the bus and the I/O device drives the data bus; then the DMA drives the memory address on the bus and the DMA drives the data bus.]

3.4

Interrupts

As you probably know, the microprocessor can be interrupted, that is, told to stop doing what it is doing and execute some other piece of software, the interrupt routine. The signal that tells the microprocessor that it is time to run the interrupt routine is the interrupt request or IRQ. Most microprocessors have several external interrupt request input pins on them. The hardware designer can connect them to the interrupt request output pins typically provided on I/O parts to allow those parts to interrupt the processor.

It is typical for the interrupt request signals to be asserted low, and it is typical for the interrupt request pins on I/O devices to be open collectors, so that several of them can share an interrupt request pin on the microprocessor. See the schematic in Figure 3.12. I/O Device A can interrupt the processor by asserting the signal attached to IRQ0/; I/O Device B can interrupt the processor by asserting IRQ1/; I/O Devices C and D can interrupt the processor by asserting IRQ2/.

Figure 3.12 Interrupt Connections
[Schematic: the CPU's IRQ0/, IRQ1/, and IRQ2/ inputs connected to I/O devices A, B, C, and D.]

Figure 3.13 A System with a UART
[Schematic: the CPU and a UART; labels include A0, A1, A2, TXD, RXD, and RTS.]

Like DMA channels responding to a DMAREQ signal, the microprocessor's response to the interrupt inputs can be edge triggered or level triggered.

3.5

Other Common Parts

In this section we'll discuss other parts found on many systems.

Universal Asynchronous Receiver/Transmitter

A Universal Asynchronous Receiver/Transmitter or UART is a common device on many systems. Its purpose is to convert data to and from a serial interface, that is, an interface on which the bits that make up the data are sent one after another. A very common standard for serial interfaces is the RS-232 interface, used between computers and modems and nowadays often between computers and mice.

A typical UART and its connections are shown in Figure 3.13. On the left-hand side of the UART are those signals that attach to the bus structures we discussed back in Section 3.1: address lines, data lines, read and write lines, and an interrupt line. From the perspective of the microprocessor, the UART looks very much like some more memory in that when the microprocessor wishes to send data to or receive data from the UART, it puts out the same sequences of signals on the bus (unless the UART is in the I/O address space). As with a ROM or a RAM, external circuitry must figure out when to drive the chip enable signal on the UART.

At the bottom of the UART is a connection into a clock circuit. The clock circuit for the UART is often separate from the microprocessor's clock circuit, because it must run at a frequency that is a multiple of the common bit rates. UART clock circuits typically run at odd rates such as 14.7456 megahertz, simply because 14,745,600 is an even multiple of 28,800, and 28,800 bits per second is a common speed for communications. There is no similar restriction on the clock that drives the microprocessor.

The signals on the right are the ones that go to the serial port: a line for transmitting bits one after another (TXD), a line for receiving bits (RXD), and some standard control lines used in the RS-232 serial protocol (request-to-send, RTS; clear-to-send, CTS; etc.). The lines are connected to an RS-232 driver/receiver part. The UART usually runs at the standard 3 or 5 volts of the rest of the circuit, but the RS-232 standard specifies that a 0 be represented by +12 volts and a 1 by -12 volts. The driver/receiver part is responsible for taking the UART output signals and converting them from 0 volts and 3 volts to +12 and -12 volts, and for converting the input signals from the connector from +12 and -12 volts to 0 volts and 5 volts.
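The point made above about UART clock rates is easy to check with a little arithmetic. The sketch below tests whether a clock frequency divides exactly by a bit rate; it is only an illustration, the function name exact_divisor is invented, and it ignores any additional fixed division that a particular UART may apply internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true and stores the divisor if clock_hz is an exact multiple
   of bits_per_second; returns false (leaving *divisor alone) otherwise. */
bool exact_divisor(uint32_t clock_hz, uint32_t bits_per_second,
                   uint32_t *divisor)
{
    assert(bits_per_second > 0 && divisor != NULL);

    if (clock_hz % bits_per_second != 0)
        return false;

    *divisor = clock_hz / bits_per_second;
    return true;
}
```

A 14,745,600 hertz clock divides by 28,800 exactly 512 times, whereas a round figure such as 16,000,000 hertz does not divide evenly by 28,800 at all.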

A typical UART, in common with many other I/O devices, has a handful of internal locations for data, usually called registers, to which the microprocessor can write to control behavior of the UART and to send it data to be transmitted and from which the microprocessor can read to retrieve data that the UART has received. Each register is at a different address within the UART. The typical registers you might find in a UART include the following:

• A register into which the microprocessor writes bytes to be transmitted. (The microprocessor writes the data a byte at a time, and the UART will transmit them a bit at a time.)

• A register from which the microprocessor reads received bytes. Note that this might be at the same address within the UART as the previous register, since the manufacturer of the UART can reasonably assume that you will only read from this register and only write to the other. Note that it is often the case that you cannot read back data that you have written into registers in UARTs and other devices, whether or not the manufacturer has used the same address for another register.

• A register with a collection of bits that indicate any error conditions on received characters (bad parity, bad framing, etc.).

• A register the microprocessor writes to tell the UART when to interrupt. Individual bits in that register might indicate that the UART should interrupt when it has received a data byte, when it has sent a byte, when the clear-to-send signal has changed on the port, etc.

• A register the microprocessor can read to find out why the UART interrupted. Note that reading or writing this or other registers in the UART often has side effects, such as clearing the interrupt request and causing the UART to stop asserting its interrupt signal.

• A register the microprocessor can write to control the values of request-to-send and other outgoing signals.

• A register the microprocessor can read to find out the values of the incoming signals.

• One or more registers the microprocessor writes to indicate the data rate. Typically, UARTs can divide their clocks by whatever number you specify. You specify the number by writing it into some registers in the UART.

Your program controls the UART by reading from and writing to these registers at appropriate moments.
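To make this concrete, here is a minimal sketch of polling code for a UART. The register layout and the two status bits are hypothetical, invented for this example; a real UART's data sheet defines the actual registers. An ordinary structure in memory stands in for the device so that the code can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout; not taken from any real part. */
typedef struct {
    volatile uint8_t data;    /* write: byte to transmit; read: received byte */
    volatile uint8_t status;  /* status bits, defined below */
} uart_regs_t;

#define UART_TX_READY 0x01u   /* assumed: transmitter can accept a byte */
#define UART_RX_READY 0x02u   /* assumed: a received byte is waiting */

/* Try to send one byte; returns 1 if the UART accepted it, 0 if busy. */
int uart_try_send(uart_regs_t *uart, uint8_t byte)
{
    assert(uart != NULL);
    if (!(uart->status & UART_TX_READY))
        return 0;
    uart->data = byte;
    return 1;
}

/* Try to fetch one received byte; returns 1 and stores it if one waits. */
int uart_try_receive(uart_regs_t *uart, uint8_t *byte)
{
    assert(uart != NULL && byte != NULL);
    if (!(uart->status & UART_RX_READY))
        return 0;
    *byte = uart->data;
    return 1;
}
```

Both functions return immediately rather than waiting, which leaves the caller free to decide whether to poll again or to rely on the UART's interrupt instead.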

UARTs come with all sorts of bells and whistles, of which the following are just examples:

• On very simple ones, you must write one byte and wait for that byte to be transmitted before writing the next; more complex UARTs contain a First-In-First-Out buffer, or FIFO, that allows your software to get several bytes ahead. The UART will store the bytes and eventually catch up.

• Similarly, more complex UARTs contain FIFOs for data that is being received, relieving your software of the requirement to read one byte before the next one arrives.

• Some UARTs will automatically stop sending data if the clear-to-send signal is not asserted.

• Some UARTs have built-in DMA or at least the logic to cooperate with a DMA channel.

Programmable Array Logic

Most systems require a certain amount of glue circuitry in addition to the microprocessor, the ROM, the RAM, and the other major parts. Glue circuitry connects outputs that assert high to inputs that assert low, drives chip-enable signals appropriately based on the address signals, and so on. In the past, this glue was often constructed out of individual AND, NAND, and OR gates and inverters. However, circuits with fewer parts are generally cheaper to build and more reliable, so engineers nowadays try to avoid large collections of these simple parts and use instead fewer, more complex parts.

Each system needs its own combination of glue circuitry to work, however, so each one must be designed afresh. No single chip will do the job for any arbitrary system. This problem has led to a class of parts called Programmable Logic Devices or PLDs. These devices allow you to build more or less any small glue circuit you want, even if what you want includes three-input NAND gates in which two of the inputs are inverted.

The smallest of the PLDs have 10 to 20 pins and an array of gates that you can hook up after you buy them; these parts are called Programmable Array Logic or PALs. In essence, a PAL has a rather large collection of discrete parts in it and a method by which you can rearrange the connections among these parts and between the parts and the pins. The method usually requires a piece of equipment, a PAL programmer, much as programming PROMs requires a PROM programmer. (In fact, there are a number of PROM programmers that also will program some kinds of PALs.)

Let's suppose that the glue we need for a certain system is as follows:

• The ROM is at addresses 0 to 0x3fff; therefore, the glue must assert its chip enable signal when address lines 14 and 15 are both low.

• The UART is at addresses starting at 0x4000; therefore, the glue must assert its chip enable signal when address line 15 is low and address line 14 is high.

• The RAM is at addresses 0x8000 to 0xffff; therefore, the glue must assert its chip enable signal when address line 15 is high.

• The ROM and the UART are slow devices, and the processor can be made to extend its cycle with a WAIT signal. The WAIT signal needs to be asserted for two processor clock cycles whenever the ROM or the UART is used.

If we build this system with a PAL, the schematic might look something like the one in Figure 3.14. The data and address busses and the READ/ and WRITE/ lines are hooked up as we discussed earlier, but the PAL takes in A14, A15, and the processor clock and generates the various chip enables and the WAIT signal back to the processor.

Figure 3.14 A Circuit with a PAL in It
[Schematic: the CPU, the ROM, the RAM, the UART, and the PAL. Data, address, READ/, and WRITE/ connect the CPU to the other parts; A14 and CLK are among the PAL's inputs, and ROMCE/ and UARTCE/ are among its outputs.]

Figure 3.15 PAL Code

    Declarations

        AddrDecode  DEVICE  'P22V10'

    "INPUTS"

        A15         PIN
        A14         PIN 2
        iClk        PIN 3

    "OUTPUTS"

        !RamCe      PIN 19
        !UartCe     PIN 18
        !RomCe      PIN 17
        Wait        PIN 16
        Wait2       PIN 15

    Equations

        RamCe  = A15
        RomCe  = !A15 * !A14
        UartCe = !A15 * A14

        Wait.CLK  = iClk
        Wait2.CLK = iClk

        Wait  := (RomCe + UartCe) * !Wait2
        Wait2 := Wait * !Wait2

    end AddrDecode

Obviously, it's a little difficult to determine how the circuit in Figure 3.14 works without knowing something about how the PAL works. To know how the PAL works, you need to know the PAL equations, which describe what the PAL does. The PAL equations can be written in any of several languages that have been created for the purpose. An example is in Figure 3.15.

This PAL code starts by declaring that we will build a device named AddrDecode, which will be created in a P22V10 (one of the standard PALs you can buy).


The sections on "INPUTS" and "OUTPUTS" assign names to each of the pins that we will use. An exclamation point on the pin declaration indicates that the signal on that pin is asserted low. Subsequently, whenever the name that corresponds to the pin is set to 1, the pin itself will go low.

The Equations section tells how the outputs depend upon the various inputs.

The first three equations determine how the PAL will drive the chip enable signals. For example:

    RamCe = A15

will assert the RamCe signal whenever A15 is high. Since A15 is high for addresses 0x8000 to 0xffff, the RAM chip enable will be asserted whenever the microprocessor puts out an address in the RAM's range. Note that because the pin declaration for the RamCe signal indicates that it is asserted when it is low, pin 19 of the PAL will go low and select the RAM whenever the microprocessor selects an address in the range from 0x8000 to 0xffff.

Figure 3.16 PAL Timing
[Timing diagram: iClk, A15, A14, RomCe/, Wait, and Wait2.]

Similarly, the equation for RomCe asserts that signal whenever A14 and A15 are both low, that is, when the address is between 0x0000 and 0x3fff, and the equation for UartCe asserts that signal in the address range from 0x4000 to 0x7fff. In this language, the asterisk represents a logical AND and the plus sign represents OR.
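The three combinatorial equations can be mirrored directly in C, which is a convenient way to check the address ranges. This is a sketch only; the structure and function names are invented for the example, and true stands for "asserted" regardless of whether the physical pin is asserted low.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Chip enables produced by the glue logic; true means asserted. */
typedef struct {
    bool rom_ce;
    bool uart_ce;
    bool ram_ce;
} enables_t;

enables_t decode_enables(uint16_t address)
{
    bool a15 = (address >> 15) & 1u;
    bool a14 = (address >> 14) & 1u;
    enables_t e;

    e.ram_ce  = a15;            /* RamCe  = A15         */
    e.rom_ce  = !a15 && !a14;   /* RomCe  = !A15 * !A14 */
    e.uart_ce = !a15 && a14;    /* UartCe = !A15 * A14  */
    return e;
}

/* Self-check at the boundaries of the three address ranges. */
void check_decode(void)
{
    assert(decode_enables(0x3FFF).rom_ce);
    assert(decode_enables(0x4000).uart_ce);
    assert(decode_enables(0x8000).ram_ce);
}
```

Because the three conditions are mutually exclusive and cover every combination of A15 and A14, exactly one chip enable is asserted for any address.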

The equations for Wait and Wait2 are a little different from those for the chip enable lines. The equations for the chip enable lines are combinatorial. They depend only upon the levels of the signals on the right-hand sides of the equations, and the output signals named on the left-hand sides of the equations change as soon as the input signals on the right-hand side change. The equations for Wait and Wait2 are clocked. The equations are only evaluated by the PAL (and the Wait and Wait2 outputs are only changed) on the edge of the given clock signal. This is behavior similar to that of the D flip-flop discussed in Chapter 2. The difference between the two types of equations is shown in this PAL equation language by the use of the colon (:) in front of the equals sign.

Because of these two lines among the equations

    Wait.CLK  = iClk
    Wait2.CLK = iClk

the clock that causes the Wait and Wait2 equations to be evaluated is the rising edge of the iClk signal. Wait and Wait2 will be low at first. See how these equations work:

    Wait  := (RomCe + UartCe) * !Wait2
    Wait2 := Wait * !Wait2

On the first rising edge of iClk after RomCe or UartCe is asserted, Wait will be asserted, but the equation for Wait2 will evaluate to FALSE, because it will use the old value of the Wait signal in its calculation. On the second rising edge of iClk, Wait will remain asserted (because none of the signals on the right-hand side of the equation will have changed), but now Wait2 will go high. On the third rising edge of iClk, Wait and Wait2 will both go low.

Figure 3.16 shows the timing diagram for this PAL. Note how RomCe/ reacts immediately to A14 and A15. Note how Wait and Wait2 react only when iClk rises.

Most PAL languages have a few other features:

• They allow the programmer to put a sequence of test vectors into the program, a sequence of inputs and a sequence of expected outputs. The device that programs the PAL uses these vectors to ensure that the PAL operates correctly after it is programmed.

• They have mechanisms to allow the engineer to build state machines easily.
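The behavior of the clocked Wait and Wait2 equations discussed earlier in this section can be checked with a short simulation. This is a sketch in C rather than in a PAL language; the type and function names are invented, and each call to clock_edge stands for one rising edge of iClk. The key point it models is that both new values are computed from the old ones.

```c
#include <assert.h>
#include <stdbool.h>

/* The two clocked outputs of the PAL; true means asserted. */
typedef struct {
    bool wait;
    bool wait2;
} wait_state_t;

/* Apply one rising edge of iClk. Both outputs are computed from the
   values they had before the edge, as in a clocked PAL equation. */
wait_state_t clock_edge(wait_state_t old, bool rom_ce, bool uart_ce)
{
    wait_state_t next;

    next.wait  = (rom_ce || uart_ce) && !old.wait2;  /* Wait  := (RomCe + UartCe) * !Wait2 */
    next.wait2 = old.wait && !old.wait2;             /* Wait2 := Wait * !Wait2 */
    return next;
}

/* Self-check: three edges with RomCe asserted, starting from both low. */
void check_wait_sequence(void)
{
    wait_state_t s = { false, false };

    s = clock_edge(s, true, false);
    assert(s.wait && !s.wait2);      /* first edge: Wait only */
    s = clock_edge(s, true, false);
    assert(s.wait && s.wait2);       /* second edge: both asserted */
    s = clock_edge(s, true, false);
    assert(!s.wait && !s.wait2);     /* third edge: both low again */
}
```

The sequence matches the description in the text: Wait is asserted for two clock cycles and is then released.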


Application-Specific Integrated Circuits and Field-Programmable Gate Arrays

Two other kinds of parts vou are likelv to find on modern circuits are

Application-Specific Integrated Circuits' ASICs, pronounced

and Field-Programmable Gate Arrays (or FPGAs). These parts are mcreas-

because are an economirnl wav to create custom, rnm"lc'v

hardware on a crrcuit without a lot

flu ASIC is an CIrcuit built. to go into the circuit for which

an ASIC can contain whatever the hardware engineer

J11 pr;;ctice !\SICs consist of a core of some kind, typically

plus perhaps some modest penpheralsand all of the glue

to hold the CHUJit On the schematic, an ASIC is very often

shewn; as ~ .Iymh01 such jJ1 3.17, which tells you nothing about

what the ASIC does. Therefore, If the circuit you are working with contains

get some of what the ASICs do.

expensive

it If there's in it, most hardware engineers

before they build them.

An FPGA is like a large PAL, in that it has a large number of gates in it, and the connections among them can be programmed after the part has been manufactured. Some of these parts are programmed in a special programming device; others can be programmed by the microprocessor even after the product has been shipped into the field. In some systems, the software must program the FPGAs every time the system starts up.

Watchdog Timers

extremely expensrvc to docu-nent their ASICs

a unless it is restarted. The watchdog timer has an output that pulses should the timer ever expire, but the idea is that the timer will never expire. Some mechanism allows software to restart the timer whenever it wishes, forcing the timer to start timing its interval over again. If the timer is ever allowed to expire, the presumption is that the software failed to restart it often enough because the software has crashed.

The way that watchdog timers are connected into circuits is shown in Figure 3.18. The output of the watchdog timer is attached to the RESET/ signal on the microprocessor; if the timer expires, the pulse on its output signal resets the microprocessor and starts the software over from the beginning. Different watchdog circuits require different patterns of signals on their inputs to restart them; typical is to require any edge on the RESTART signal. Some glue circuitry may be necessary to allow the microprocessor to change the RESTART signal appropriately.

Figure 3.17 An ASIC

Figure 3.18 ... a Watchdog Timer
[Schematic: the microprocessor, glue circuitry, and the watchdog timer. Data, address, READ/, and WRITE/ run from the microprocessor to the glue; the watchdog's output drives RESET/ and its input is RESTART.]
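The interplay between the software and the watchdog can be modeled in a few lines. The sketch below simulates the timer in software purely for illustration; a real watchdog is a hardware part, and the structure and function names here are invented. Each call to watchdog_tick stands for one unit of the timer's interval, and watchdog_restart stands for whatever the software does to change the RESTART signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated watchdog timer. */
typedef struct {
    uint32_t interval;   /* ticks allowed between restarts */
    uint32_t remaining;  /* ticks left before the timer expires */
} watchdog_t;

/* What healthy software does regularly: start the interval over. */
void watchdog_restart(watchdog_t *wd)
{
    assert(wd != NULL);
    wd->remaining = wd->interval;
}

/* Advance the timer by one tick. Returns true if the timer expired,
   which in the real circuit would pulse RESET/ on the microprocessor. */
bool watchdog_tick(watchdog_t *wd)
{
    assert(wd != NULL);
    if (wd->remaining > 0)
        wd->remaining--;
    if (wd->remaining == 0) {
        wd->remaining = wd->interval;   /* timing starts over after the reset */
        return true;
    }
    return false;
}
```

As long as the software calls watchdog_restart more often than once per interval, watchdog_tick never reports an expiry; if the software stops doing so, the expiry follows within one interval.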

• The timer has an input pin that enables or disables counting. The timer circuit also may be able to function as a counter that counts pulses on that input pin.

Most timers are set up by writing values into a small collection of registers, typically registers to hold the count and a register with a collection of bits to enable the counter, to reset the interrupt, to control what the timer does to its output pin, if any, and so on.

DMA

3.6

Built-Ins on the Microprocessor

It is not unusual to find a few DMA channels built into a microprocessor chip. Since a DMA channel and the microprocessor contend for the bus, certain processes arc simplified if the DIvlA channel and the microprocessor arc on the same chip.

(If your microprocessor supports some kind of memory mapping, note that the DMA circuitry will most likely bypass it. DMA circuits operate exclusively on the physical memory addresses seen outside of the microprocessor chip.)

Microprocessors, particularly those marketed for embedded systerns, very often come WIth a number of auxiliary circuits built into them. In this section we'll discuss some of them. These auxiliary circuits are usually logically separate from the microproccssor-vthcv're just huilt on the same piece of silicon and then wired directly to the microprocessor. The advantage of these built-ins IS that you get the aux.liarv circuits in your system without havmg to add extra parts.

Each auxiliary CIrcuit, or peripheral, is controlled by writing values to a

small collection of that typically appear at some fixed locations in

the address space. The peripherals usually can interrupt the

microprocessor, just ,,5 if Were completely separate from it; there IS some mechanism that coordinates the interrupts from the on-hoard circuitry and the interrupts coming trorn outside the microprocessor.

I/O pins

Timers

It is common for microprocessors intended for embedded systems to contain anywhere from a few to a few dozen I/O pins. These pins can be configured as outputs that software can set high or low directly, usually by writing to a register, or they can be configured as inputs that software can read, again usually by reading from a register. These pinS can be used for any number of purposes, including the following:

Turning LEDs on or off

I Resetting a watchdog timer

• Reading from. a one-pin or. two-pin EEROM

• Switching from one bank ofRl\M to another if there is more Ri\M than the processor can address

Figure 3.19 shows some of these common uses.

It is commou for microprocessors to have one or more timers, A timer is essentially just a counter that counts the number of microprocessor clock cycles and then causes an mn-rrupt when the count expIres. Here arc a tl'w t'-',HUITS of the usual microprocessor rinrers:

• A pre-scaler divides the nucioprocesscr clock signal by some constant, perhaps 2(), before the SIgnal gets to the timer,

• The counter can reset itself to its initial value when it expIres and then continue to count, so that it can be the source of a regular, periodic interrupt.

I The timer can drive all output pIll on the microprocessor, either causing a pulse whenever the timer expires ot creating a square wave with an edge at every timer expiration.
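The timer features described in this section (the pre-scaler, the automatic reload, the interrupt on expiration, and the square-wave output) can be sketched as a small software model. This is only an illustration under stated assumptions: HwTimer, hw_timer_init, and hw_timer_clock are hypothetical names, and the register layout of a real timer depends entirely on the microprocessor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an on-chip timer; field names are illustrative. */
typedef struct {
    uint32_t prescale;        /* divide the clock by this constant */
    uint32_t prescale_count;  /* clock cycles seen since the last timer step */
    uint32_t reload;          /* initial value of the count */
    uint32_t count;           /* current count */
    bool     auto_reload;     /* start over from reload when the count expires */
    bool     running;
    bool     output_pin;      /* square wave: one edge at every expiration */
    unsigned interrupts;      /* interrupts requested so far */
} HwTimer;

void hw_timer_init(HwTimer *t, uint32_t prescale, uint32_t reload,
                   bool auto_reload)
{
    assert(prescale > 0 && reload > 0);
    t->prescale = prescale;
    t->prescale_count = 0;
    t->reload = reload;
    t->count = reload;
    t->auto_reload = auto_reload;
    t->running = true;
    t->output_pin = false;
    t->interrupts = 0;
}

/* Simulate one microprocessor clock cycle reaching the timer. */
void hw_timer_clock(HwTimer *t)
{
    if (!t->running)
        return;
    if (++t->prescale_count < t->prescale)
        return;                          /* the pre-scaler swallows this cycle */
    t->prescale_count = 0;
    if (--t->count == 0) {
        t->interrupts++;                 /* the count expired: interrupt */
        t->output_pin = !t->output_pin;  /* edge on the output pin */
        if (t->auto_reload)
            t->count = t->reload;        /* periodic interrupt source */
        else
            t->running = false;          /* one-shot timer stops */
    }
}
```

With a pre-scaler of 2 and a count of 3, this model requests an interrupt every 6 clock cycles for as long as automatic reload is enabled.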

Address Decoding

Figure 3.19 Uses for I/O Pins

As we have seen in some of our earlier discussions, using an address to generate chip enables for the RAM, ROM, and various peripheral chips can be a nuisance. Some microprocessors offer to do some of that address decoding for you by having a handful of chip enable output pins that can be connected directly to the other chips. The software has to tell the microprocessor the address ranges that should assert the various chip enable outputs. Often, you can program the microprocessor to use different numbers of wait states, depending upon which chip enable pin is asserted.
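The effect of programmable address decoding can be sketched in C. The sketch below is an assumption-laden illustration, not any particular microprocessor's mechanism: ChipSelect, decode_address, and example_map are hypothetical names, and the address ranges and wait-state counts are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one chip enable output. */
typedef struct {
    uint32_t low;          /* first address that asserts this chip enable */
    uint32_t high;         /* last address that asserts this chip enable */
    unsigned wait_states;  /* wait states to use while it is asserted */
} ChipSelect;

/* Return the index of the chip enable output that the address asserts,
   or -1 if the address falls in none of the programmed ranges. */
int decode_address(const ChipSelect *cs, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++) {
        assert(cs[i].low <= cs[i].high);
        if (addr >= cs[i].low && addr <= cs[i].high)
            return (int)i;
    }
    return -1;
}

/* Invented example: a slow ROM and a fast RAM. */
const ChipSelect example_map[2] = {
    { 0x00000u, 0x1ffffu, 2 },  /* 128 KB ROM, two wait states */
    { 0x80000u, 0x8ffffu, 0 },  /* 64 KB RAM, no wait states */
};
```

In a real part the software would write the equivalent of example_map into configuration registers once at startup, and the hardware would then perform the comparison on every bus cycle.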

Memory Caches and Instruction Pipelines

A number of microprocessors, particularly faster RISC (Reduced Instruction Set Computer) systems, contain a memory cache or cache on the same chip with the microprocessor. These are small but extremely fast memories that the microprocessor uses to speed up its work. The microprocessor endeavors to keep in its cache the data and instructions it is about to need; the microprocessor can fetch items that happen to be in the cache when they are needed much more quickly than it can fetch them from separate memory chips. For the most part, you can ignore the cache when you are designing program logic. It affects you when you must determine how quickly the program will execute (because that depends on what is in the cache when the program runs) and when you are trying to debug your software (because the cache conceals much about what the microprocessor is doing; see Chapter 10).

An instruction pipeline or pipeline is similar to a memory cache in that the microprocessor endeavors to load into the pipeline the instructions that it will need next, so that they are available for execution more rapidly than if they must be fetched from memory. The differences between pipelines and caches are that pipelines are much smaller than caches, that the logic behind them is often much simpler, and that the microprocessor uses them only for instructions, not for data.

3.7

Conventions Used on Schematics

Several conventions are used on schematics that are not used in the simple diagrams in this chapter:

• Signals are not always shown as continuous lines. Each signal has a name; if two lines on the schematic have the same name, they are connected, even though it isn't explicitly shown. For example, if one of the address lines coming out of the microprocessor is labeled "A15," then every other line labeled A15 is that same address signal.

• The actual pin numbers on the parts that will be used in the circuit are shown next to each signal coming out of each part.

• Parts numbered P1, P2, P3, etc. are connectors, places where we can connect this circuit to external devices.

• Parts numbered J1, J2, J3, etc. are jumpers, places on the circuit where a customer is expected to connect pins together or not, depending upon how he wants to use the circuit.

3.8

A Sample Schematic

Figure 3.20 is the schematic diagram for a board distributed by Zilog to demonstrate its Z80180 microprocessor and a communication chip called the SCC, which is almost too fancy to be called a UART. A few comments about the schematic are listed below; more guidance about how this circuit works is included in the problems at the end of this chapter. With this and with the material we have discussed, you should be able to figure out much about how this circuit works.

Figure 3.20 A Sample Schematic

Here are a few facts about the schematic in Figure 3.20 that are not obvious from the discussion we have had:

• The parts labeled P1 through P4 are indeed connectors, as you might expect. Since this is a demonstration board, practically every significant signal on it goes to a connector, just to make it easy to connect test equipment. On most circuits you would not see so many signals going to connectors.

• The part labeled P5 is a connector to which the user can connect a power supply.

• The parts labeled J1 through J4 are jumpers. J1 and J2 control clock options on the SCC. J3 and J4 control how certain address lines connect to memory parts. To start with you should assume that pin 2 and pin 3 on J3 have been connected to one another and that pin 2 and pin 3 on J4 have been connected to one another.

• The part labeled P6 is also a connector, but its purpose is to allow the user to configure the board further by connecting some of its pins to one another. Assume that the user has connected none of these pins to one another.

• Because of the extensive configurability of this board, many signals have pullup resistors attached to them. This forces them to be high if the user does not force them low. For example, the signal USRRAM, found on connector P6 and attached to one of the inputs on one of the NAND gates in the lower left-hand corner of the schematic, will always be high because of the pullup resistor, unless the user connects it directly to ground by connecting pin 1 to pin 6 of connector P6.

• The part labeled U5 is a programmable logic device that deals with the timing requirements of the SCC.

• The part labeled U8 is an RS-232 driver.

3.9

A Last Word about Hardware

One thing that you will notice whenever you talk to hardware engineers is that they operate with a different set of concerns than do software engineers. Here are some of those concerns:

• Unlike software, for which the engineering cost is almost all of the cost, every copy of the hardware costs money. You have to pay for every part on the circuit every time you build a new circuit. Now if the total production run is expected to be only 100 units, no one will get very concerned about costs, even about a $10 part. However, if you're planning to ship 30,000 units a month, then it's worth a lot of engineering effort to figure out how to eliminate a 25¢ part, or even a 5¢ part, from the circuit.

• Every additional part takes up additional space in the circuit. As companies start to build computers that are not only portable but also are wearable or even concealable, space is often at a premium.

• Every additional part uses up a little power. This is an obvious concern if your product is battery-powered, but even if it is not, more power means that your product will need a larger (and therefore more expensive) power supply.

• Every additional part turns the power it uses into heat. Eventually you have to put a bigger fan into your product to get rid of this heat or, worse, you have to turn a fan-less product into one with a fan.

• Faster circuit components cost more, use more power, and generate more heat. Therefore, clever software is often a much better way to make a product fast than is faster hardware.

Because of these considerations, hardware engineers are inclined to suggest that product functionality is best done in software rather than in additional hardware. This is not because they are lazy; it is because a product with more software and less hardware will in most cases be a better product. Prototypes and other very low volume products for which the software development cost will be a major portion of the total cost are the exceptions to this rule.

Chapter Summary

• A typical microprocessor has at least a collection of address pins, a collection of data pins, one or more clock pins, a read pin, and a write pin.

• The collection of data, address, and control signals that run among the microprocessor, the ROM, and the RAM is called the bus.

• The electrical engineer must ensure that the timing requirements of each of the parts attached to the bus are satisfied. Wait states and wait lines are mechanisms for accomplishing this.

• Direct memory access (DMA) circuits move data directly from I/O devices to memory and vice versa without microprocessor intervention.

• When an I/O device needs attention from the microprocessor, it asserts its interrupt signal to let the microprocessor know.

• A Universal Asynchronous Receiver/Transmitter (UART) converts data between an eight-bit format and the one-bit-at-a-time format used on serial ports such as RS-232 ports. UARTs are controlled by the microprocessor through a collection of registers.

• The simplest form of programmable logic device (PLD) is the programmable array logic (PAL). A PAL contains a collection of gates; you can rearrange the connections among these gates with a special programming language and a PAL programmer.

• An application specific integrated circuit (ASIC) is a part built especially for a given product.

• A watchdog timer resets the microprocessor and starts the software over from the beginning if the software does not restart it periodically.

• Typical modern microprocessors intended for embedded systems have built-in timers, DMA, I/O pins, address decoding, and memory caches.

• In addition to making their circuits work, hardware engineers must deal with concerns about cost, power, and heat.

Problems

1. Suppose that your system has two ROM chips and two RAM chips whose sizes and addresses are as shown in the following table. Design the part of the circuit that takes the address lines and produces the chip enable signals for each of these four memory parts.

      Size     Low Address   High Address
ROM   128 KB   0x00000       0x1ffff
ROM   128 KB   0x20000       0x3ffff
RAM   64 KB    0x80000       0x8ffff
RAM   64 KB    0x90000       0x9ffff

2. Suppose we are using 120 nanosecond ROMs (which have valid data on the bus 120 nanoseconds after the falling edge of OE/) and are using the microprocessor

discussed in

clock

How many wait states must the microprocessor

that reads from the ROM?

3. What are the

of hooking up devices C and D in Figure 3.12 to the

4. What are the advantages and disadvantages of edge-triggered and level-triggered interrupts?

5. Why are there three address pins on the "typical" UART in Figure 3.13?

6. What other pins might you find on a UART in addition to those shown in Figure 3.13?

7. Why is a FIFO useful for received data in a UART?

8. How might you drive an LED if your microprocessor does not have any I/O pins?

9. Why can't you use microprocessor I/O pins as chip enable pins for ROM and RAM?

10. How would you imagine that the EEROM in Figure 3.19 works? (Note that this is a not uncommon pin configuration for EEROMs.)

The following problems all apply to the sample circuit shown in Figure 3.20.

11. The schematic in Figure 3.20 contains a microprocessor, a ROM, and a RAM. Examine the connections available on the parts shown on the schematic to determine which part is the microprocessor; which, the ROM; and which, the RAM.

12. By examining the connections available on the microprocessor, determine the size of its address space. Similarly, how big are the ROM and RAM chips on the board? Also, does this microprocessor have a separate I/O address space?

13. Assuming that pins 2 and 3 are attached on jumper J3, attaching signal A13 to RA13, and that pins 2 and 3 are attached on jumper J4, attaching signal A14 to RO14, where does the RAM appear in the microprocessor's address space? Where does the ROM appear? (Note that this latter question is a little trickier than it appears, because signal A16 is attached to the connection for A15 on the ROM.) Where does the SCC appear?

14. What is the effect of attaching USRRAM to ground by connecting pin 6 to pin 1 on connector P6?

4

Interrupts

Having completed our digression into hardware, let's get started with our main subject, embedded-system software, starting with the response problem raised in Chapter 1. As discussed in that chapter, the response problem is the difficult one of making sure that the embedded system reacts rapidly to external events, even if it is in the middle of doing something else. For example, even if the underground tank monitoring system is busy calculating how much gasoline is in tank number six, it must still respond promptly if the user presses a button requesting to know how much is in tank number two.

The first approach to the response problem (the one that we will discuss in this chapter) is to use interrupts. Interrupts cause the microprocessor in the embedded system to suspend doing whatever it is doing and to execute some different code instead, code that will respond to whatever event caused the interrupt. Interrupts can solve the response problem, but not without some difficult programming, and not without introducing some new problems of their own.

4.1

Microprocessor Architecture

Before we can discuss interrupts you must know about how microprocessors work. If you are reasonably familiar with assembly language (any assembly language) you can skip this section and go on to Section 4.2. Here we are going to discuss the little bit about microprocessor architecture and assembly language that you need in order to grasp some of the concepts we'll be discussing. Most microprocessors and their assembly languages are similar to one another in a general way. We're going to discuss the parts that are similar; we have no need for the details that make microprocessors and assembly languages complicated and make them differ from one another.

If you are not familiar with assembly language, you should know the following:

• Assembly language is the human-readable form of the instructions that the microprocessor really knows how to do. A program called an assembler translates the assembly language into binary numbers before the microprocessor can execute them, but each assembly-language instruction turns into just one instruction for the microprocessor.

• When the compiler translates C, most statements become multiple instructions for the microprocessor to execute. Most C compilers will produce a listing file that shows the assembly language that would be equivalent to the C.

• Every family of microprocessors has a different assembly language, because each family understands a different set of instructions. Within each family, the assembly languages for the individual microprocessors usually are almost identical to one another.

The typical microprocessor has within it a set of registers, sometimes called general-purpose registers, each of which can hold a value that the processor is working with. Before doing any operation on data, such as arithmetic, for example, most microprocessors must move the data into registers. Each microprocessor family has a different number of registers and assigns a different collection of names to them. For this discussion, we will assume that our microprocessor has registers called R1, R2, R3, and so on.

In addition to the general-purpose registers, most microprocessors have several special registers. Every microprocessor has a program counter, which keeps track of the address of the next instruction that the microprocessor is to execute. Most have a stack pointer, which stores the memory address of the top of the general-purpose microprocessor stack.

In a typical assembly language, when the name of a variable appears in an instruction, that refers to the address of that variable. To refer to the value of a variable, you put the name of the variable in parentheses. In most assembly languages anything that follows a semicolon is a comment, and the assembler will ignore it.

The most common instruction is one that moves data from one place to another:

        MOVE  R3, R2

This instruction reads the value in register R2 and copies it into register R3.¹ Similarly

        MOVE  R5, (iTemperature)

reads the value of iTemperature from the memory and copies the result into register R5. Note that this instruction

        MOVE  R5, iTemperature

places the address of iTemperature into register R5.

Although some microprocessors can only do arithmetic in a special register called the accumulator, many can do standard arithmetic or bit-oriented operations in any register. For example

        ADD   R7, R3

adds the contents of register R3 into register R7. This instruction

        NOT   R4

inverts all of the bits in register R4.

Assembly languages have a jump instruction that unconditionally continues execution at the instruction whose label matches the one found in the jump instruction. Labels are followed by colons in many assembly languages. For example

        ADD   R1, R2
        JUMP  NO_ADD
MORE_ADDITION:
        ADD   R1, R3      ; These are skipped
        ADD   R1, R4
NO_ADD:
        MOVE  (xyz), R1

adds the contents of register R2 to register R1 but then jumps down to the instruction that saves the contents of register R1 in variable xyz without adding in the contents of registers R3 and R4.

1. In some assembly languages, the MOVE R3, R2 instruction would operate in the opposite direction, reading the value in register R3 and copying it into register R2. Assembly languages differ from one another in all sorts of details such as this.

Assembly languages also contain conditional jump instructions, instructions that jump if a certain condition is true. Most microprocessors can test conditions such as whether the result of a previous arithmetic operation was 0 or greater than 0 and other similar, simple things. Here is an example:

        SUBTRACT R1, R5
        JCOND    ZERO, NO_MORE
        MOVE     R3, (xyz)
NO_MORE:

If register R1 and register R5 have the same value, then the result of the subtraction will be 0, and the program will jump to the label NO_MORE. If the two registers have unequal values, then the result of the subtraction will not be zero, and the processor will move the value of xyz into register R3.

Most assembly languages have access to a stack with PUSH and POP instructions. The PUSH instruction adjusts the stack pointer and adds a data item to the stack. The POP instruction retrieves the data and adjusts the stack pointer back.

Last, most assembly languages have a CALL instruction for getting to subroutines or functions and a RETURN instruction for getting back. For example:

        CALL  ADD_EM_UP
        MOVE  (xyz), R1

ADD_EM_UP:
        ADD   R1, R3
        ADD   R1, R4
        ADD   R1, R5
        RETURN

The CALL instruction typically causes the microprocessor to push the address of the instruction after the CALL (in this case, the address of the MOVE instruction) onto the stack. When it gets to the RETURN instruction, the microprocessor automatically pops that address from the stack to find the next instruction it should execute.
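The way CALL and RETURN use the stack can be sketched in C. This is a minimal model under stated assumptions: CallCpu, cpu_push, cpu_pop, cpu_call, and cpu_return are hypothetical names, the stack here grows upward for simplicity (many real stacks grow downward), and the length of the CALL instruction is passed in because it varies from one microprocessor to another.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model of the two special registers involved in a call. */
typedef struct {
    uint16_t pc;                  /* program counter */
    uint16_t sp;                  /* index of the next free stack slot */
    uint16_t stack[STACK_WORDS];
} CallCpu;

void cpu_push(CallCpu *c, uint16_t value)
{
    assert(c->sp < STACK_WORDS);
    c->stack[c->sp++] = value;    /* add the item, adjust the stack pointer */
}

uint16_t cpu_pop(CallCpu *c)
{
    assert(c->sp > 0);
    return c->stack[--c->sp];     /* adjust the pointer back, fetch the item */
}

/* CALL: save the address of the instruction after the CALL, then jump. */
void cpu_call(CallCpu *c, uint16_t target, uint16_t call_length)
{
    cpu_push(c, (uint16_t)(c->pc + call_length));
    c->pc = target;
}

/* RETURN: pop the saved address to find the next instruction to execute. */
void cpu_return(CallCpu *c)
{
    c->pc = cpu_pop(c);
}
```

After cpu_call followed by cpu_return, the program counter holds the address of the instruction that followed the CALL, and the stack is back where it started.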

Figure 4.1 has an example of C code and its translation into our assembly language.

Figure 4.1 C and Assembly Language

x = y + 133;
        MOVE     R1, (y)        ; Get the value of y into R1
        ADD      R1, 133        ; Add 133
        MOVE     (x), R1        ; Save the result in x

if (x >= z)
        MOVE     R2, (z)        ; Get the value of z
        SUBTRACT R1, R2         ; Subtract z from x
        JCOND    NEG, L101      ; Skip if the result is negative

    z += y;
        MOVE     R1, (y)        ; Get the value of y into R1
        ADD      R2, R1         ; Add it to z
        MOVE     (z), R2        ; Save the result in z

w = sqrt (z);
L101:
        MOVE     R1, (z)        ; Get the value of z into R1
        PUSH     R1             ; Put the parameter on the stack
        CALL     SQRT           ; Call the sqrt function
        MOVE     (w), R1        ; The result comes back in R1
        POP      R1             ; Throw away the parameter

4.2

Interrupt Basics

In this section we'll discuss what interrupts are, what microprocessors typically do when an interrupt happens, what interrupt routines typically do, and how they are usually written. Readers familiar with this material should skip to Section 4.3.

To begin with, interrupts start with a signal from the hardware. Most I/O chips, such as ones that drive serial ports or network interfaces, need attention when certain events occur. For example, when a serial port chip receives a character from the serial port, it needs the microprocessor to read that character from where it is stored inside of the serial port chip itself and to store it somewhere in memory. Similarly, when a serial port chip has finished transmitting one character, it needs the microprocessor to send it the next character to be transmitted. A network chip (and almost any other kind of I/O chip) needs the microprocessor's assistance for similar sorts of events.

Each of these chips has a pin that it asserts when it requires service. The hardware engineer attaches this pin to an input pin on the microprocessor called an interrupt request, or IRQ, that lets the microprocessor know that some other chip in the circuit wants help. Most microprocessors have several such pins so that several different chips can be connected and request the microprocessor's attention. (See Figure 4.2.)

Figure 4.2 Interrupt Hardware (labels: the interrupt request pins; a signal that tells the microprocessor that the network chip needs service)

When the microprocessor detects that a signal attached to one of its interrupt request pins is asserted, it stops executing the sequence of instructions it was executing, saves on the stack the address of the instruction that would have been next, and jumps to an interrupt routine. Interrupt routines are subroutines that you write, subroutines that do whatever needs to be done when the interrupt signal occurs. For example, when the interrupt comes from the serial port chip and that chip has received a character from the serial port, the interrupt routine must read the character from the serial port chip and put it into memory. Typically, interrupt routines also must do some miscellaneous housekeeping chores, such as resetting the interrupt-detecting hardware within the microprocessor to be ready for the next interrupt.

An interrupt routine is sometimes called an interrupt handler or an interrupt service routine. It is also sometimes called by the abbreviation ISR.

The last instruction to be executed in an interrupt routine is an assembly language RETURN instruction. When it gets there, the microprocessor retrieves from the stack the address of the next instruction it should do (the one it was about to do when the interrupt occurred) and resumes execution from there. In effect, the interrupt routine acts like a subroutine that is called whenever the hardware asserts the interrupt request signal. There is no CALL instruction; the microprocessor does the call automatically in response to the hardware signal.

Figure 4.3 Interrupt Routines

Task Code

        MOVE     R1, (iCentigrade)
        MULTIPLY R1, 9
        DIVIDE   R1, 5
        ADD      R1, 32
        MOVE     (iFarnht), R1
        JCOND    ZERO, 109A1
        JUMP     14403
        MOVE     R5, 23
        PUSH     R5
        CALL     Skiddo
        POP      R9
        MOVE     (Answer), R1
        RETURN

Interrupt Routine

        PUSH     R1
        PUSH     R2
        !! Read char from hw into R1
        !! Store R1 value into memory
        !! Reset serial port hw
        !! Reset interrupt hardware
        POP      R2
        POP      R1
        RETURN

Figure 4.3 shows a microprocessor responding to an interrupt. On the left-hand side of this figure, the microprocessor is busy doing the task code, the term we will use in this book for any code that is not part of an interrupt routine. (There is no common word for this concept.) The task code in Figure 4.3 is busy converting temperatures from centigrade to Fahrenheit. It moves the centigrade temperature into register R1, does the necessary arithmetic, and stores the result. When the interrupt occurs, the microprocessor suspends the task code and goes to the instructions that make up the interrupt routine. It does all of those instructions; when it comes to the RETURN instruction at the end of the interrupt routine, it goes back to the task code and continues converting temperatures. (Note that some microprocessors, those in the Intel x86 family, for example, have a special "return from interrupt routine" instruction separate from the regular return instruction you use at the ends of subroutines. When you write interrupt routines for those microprocessors, you must use the special instruction.)

Saving and Restoring the Context

Notice that the task code in Figure 4.3 assumes that the value in register R1 stays put from one instruction to the next. If the centigrade temperature is 15, then the microprocessor will load 15 into register R1, multiply that by 9 to get 135, and then will expect the 135 to stay there to be divided by 5. If something changes the value in register R1 in the meantime, then the program won't convert the temperatures properly.

The thing that might change the value in register R1 is the interrupt routine. If the interrupt occurs right after the microprocessor finishes the MULTIPLY instruction, then the microprocessor will execute the entire interrupt routine before it gets to the DIVIDE instruction. It is therefore necessary that the value in register R1 be the same after the interrupt routine finishes as it was before the interrupt routine started.

It is difficult or impossible for a microprocessor to get much done without using at least some of the registers. As we mentioned in Section 4.1, most microprocessors must move data values into the registers before they can operate on them. Therefore, it is unreasonable to expect anyone to write an interrupt routine that doesn't touch any of the registers. The most common practice to get around this problem is for the interrupt routine to save the contents of the registers it uses at the start of the routine and to restore those contents at the end. Usually, the contents of the registers are saved on the stack. In Figure 4.3 you can see that the interrupt service routine pushes the values in registers R1 and R2 onto the stack at the beginning and then pops them (in reverse order, note) at the end. Similarly, you must write your interrupt service routines to push and pop all of the registers they use, since you have no way of knowing what registers will have important values in them when the interrupt occurs.

Pushing all of the registers at the beginning of an interrupt routine is known as saving the context; popping them at the end, as restoring the context. Failing to do these operations properly can cause troublesome bugs. For example, if whoever wrote the interrupt routine in Figure 4.3 had forgotten to save and restore register R1, then temperatures might not be translated properly. The distressing thing about this bug would be that temperatures might well be translated properly most of the time. The bug would only show up occasionally, when the interrupt just happened to occur in the middle of the calculation. As long as the interrupt occurred only when register R1 is not important, the system would appear to work just fine.
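The effect of forgetting to save the context can be demonstrated with a small C model of the situation in Figure 4.3. Everything here is a sketch under stated assumptions: Machine, interrupt_routine, and centigrade_to_fahrenheit are hypothetical names, the registers are modeled as an array, and the interrupt is forced to occur right after the multiplication.

```c
#include <assert.h>

enum { R1, R2, NUM_REGS };

/* Hypothetical model: two registers and a small stack. */
typedef struct {
    int reg[NUM_REGS];
    int stack[8];
    int sp;            /* index of the next free stack slot */
} Machine;

static void push(Machine *m, int r)
{
    assert(m->sp < 8);
    m->stack[m->sp++] = m->reg[r];
}

static void pop(Machine *m, int r)
{
    assert(m->sp > 0);
    m->reg[r] = m->stack[--m->sp];
}

/* An interrupt routine that uses R1 and R2 as scratch registers. */
void interrupt_routine(Machine *m, int save_context)
{
    if (save_context) { push(m, R1); push(m, R2); }  /* saving the context */
    m->reg[R1] = 'A';    /* pretend to read a character from the hardware */
    m->reg[R2] = 0;
    if (save_context) { pop(m, R2); pop(m, R1); }    /* restoring, reversed */
}

/* The task code of Figure 4.3, with an optional interrupt after MULTIPLY. */
int centigrade_to_fahrenheit(Machine *m, int centigrade,
                             int interrupt_fires, int save_context)
{
    m->reg[R1] = centigrade;        /* MOVE R1, (iCentigrade) */
    m->reg[R1] *= 9;                /* MULTIPLY R1, 9 */
    if (interrupt_fires)
        interrupt_routine(m, save_context);
    m->reg[R1] /= 5;                /* DIVIDE R1, 5 */
    m->reg[R1] += 32;               /* ADD R1, 32 */
    return m->reg[R1];              /* MOVE (iFarnht), R1 */
}
```

With the context saved, 15 degrees centigrade converts to 59 whether or not the interrupt occurs. Without it, the interrupt leaves the character code in R1 and the conversion silently produces the wrong answer.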

Disabling Interrupts

Almost every system allows you to disable interrupts, usually in a variety of ways. To begin with, most I/O chips allow your program to tell them not to interrupt, even if they need the microprocessor's attention. This stops the interrupt signal at the source. Further, most microprocessors allow your program to tell them to ignore incoming signals on their interrupt request pins. In most cases your program can select the individual interrupt request signals to which the microprocessor should pay attention and those it should ignore, usually by writing a value in a special register in the microprocessor. There is almost always a way, often with a single assembly-language instruction, to tell the microprocessor to ignore all interrupt requests and a corresponding way to tell it to start paying attention again.

Most microprocessors have a nonmaskable interrupt, an input pin that causes an interrupt that cannot be disabled. As we will discuss in Section 4.3, if an interrupt routine shares any data with the task code, there are times when it is necessary to disable that interrupt. Since you can't disable the nonmaskable interrupt, the associated interrupt routine must not share any data with the task code. Because of this, the nonmaskable interrupt is most commonly used for events that are completely beyond the normal range of the ordinary processing. For example, you might use it to allow your system to react to a power failure or a similar catastrophic event.

Some microprocessors use a somewhat different mechanism for disabling and enabling interrupts. These microprocessors assign a priority to each interrupt request signal and allow your program to specify the priority of the lowest-priority interrupt that it is willing to handle at any given time. It can disable all interrupts (except for the nonmaskable interrupt) by setting the acceptable priority higher than that of any interrupt, it can enable all interrupts by setting the acceptable priority very low, and it can selectively enable interrupts in priority order by setting the acceptable priority at intermediate values. This priority mechanism is sometimes in addition to allowing you to enable and disable individual interrupts.
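The priority mechanism just described, combined with individual enable bits, can be sketched as follows. This is an illustrative model only: IrqController, irq_accepted, and irq_select are hypothetical names, and the convention that a larger number means a higher priority is an assumption (some microprocessors number priorities the other way).

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_IRQS 8

/* Hypothetical model of the interrupt-masking state in a microprocessor. */
typedef struct {
    int  priority[NUM_IRQS];  /* larger number = higher priority (assumed) */
    bool enabled[NUM_IRQS];   /* individual enable bit for each request */
    int  acceptable;          /* lowest priority the program will handle */
} IrqController;

/* Would the microprocessor pay attention to this interrupt request? */
bool irq_accepted(const IrqController *ic, int irq)
{
    assert(irq >= 0 && irq < NUM_IRQS);
    return ic->enabled[irq] && ic->priority[irq] >= ic->acceptable;
}

/* Among the pending requests, pick the highest-priority one that is
   accepted. Returns -1 if every pending request is currently disabled. */
int irq_select(const IrqController *ic, const bool pending[NUM_IRQS])
{
    int best = -1;
    for (int i = 0; i < NUM_IRQS; i++) {
        if (pending[i] && irq_accepted(ic, i) &&
            (best < 0 || ic->priority[i] > ic->priority[best]))
            best = i;
    }
    return best;
}
```

Setting acceptable above every priority disables all of the interrupts in this model, and setting it very low enables all of those whose individual enable bits are set.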


Some Common Questions

How does the microprocessor know where to find the interrupt routine when the interrupt occurs? This depends on the microprocessor, and you'll have to look at the manual to find out how your microprocessor does it. Some microprocessors assume that the interrupt service routine is at a fixed location. For example, if an I/O chip signals an Intel 8051 on its first interrupt request pin, the 8051 assumes that the interrupt routine is at location 0x0003. It becomes your job to make sure that the interrupt routine is there. Other microprocessors have more sophisticated methods. The most typical is that a table somewhere in memory contains interrupt vectors, the addresses of the interrupt routines. When an interrupt occurs, the microprocessor will look up the address of the interrupt routine in this interrupt vector table. Again, it is your job to set up that table properly.
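An interrupt vector table can be pictured in C as an array of function pointers. The sketch below is only an analogy under stated assumptions: vector_table, install_vector, dispatch_interrupt, and serial_isr are hypothetical names, and on a real microprocessor the lookup and the jump are done by the hardware, not by a C function.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 4

typedef void (*InterruptRoutine)(void);

/* The table of interrupt vectors: one routine address per request. */
InterruptRoutine vector_table[NUM_VECTORS];

int chars_received;   /* stands in for the work a real routine would do */

/* A hypothetical interrupt routine for a serial port. */
void serial_isr(void)
{
    chars_received++;
}

/* Setting up the table properly is the programmer's job. */
void install_vector(int irq, InterruptRoutine routine)
{
    assert(irq >= 0 && irq < NUM_VECTORS);
    vector_table[irq] = routine;
}

/* Model of what happens on an interrupt: look up the address of the
   routine and go there. Returns 1 if a routine ran, or 0 if the table
   entry was never set up. */
int dispatch_interrupt(int irq)
{
    assert(irq >= 0 && irq < NUM_VECTORS);
    if (vector_table[irq] == NULL)
        return 0;
    vector_table[irq]();
    return 1;
}
```

Real hardware has no equivalent of the NULL check; if the table entry is wrong, the microprocessor simply jumps to whatever address it finds there.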

How do microprocessors that use an interrupt vector table know where the table is? Again, this depends upon the microprocessor. In some, the table is always at the same location in memory, at 0x00000 for the Intel 80186, for example. In others, the microprocessor provides your program with some way to tell it where the table is.

Can a microprocessor be interrupted in the middle of an instruction? Usually not. In almost every case, the microprocessor will finish the instruction that it is working on before jumping to the interrupt routine. The most common exceptions are those single instructions that move a lot of data from place to place. Both the Zilog Z80 and the Intel x86 families of microprocessors, for example, have single instructions that move potentially thousands of bytes of data. These instructions can be interrupted at the end of transferring a single byte or word and will resume where they left off when the interrupt routine returns.

If two interrupts happen at the same time, which interrupt routine does the microprocessor do first? Almost every microprocessor assigns a priority to each interrupt signal, and the microprocessor will do the interrupt routine associated with the higher-priority signal first. Microprocessors vary all over the map when it comes to how your program can control the priorities of the interrupts.

Can an interrupt request signal interrupt another interrupt routine? On most microprocessors, yes. On some microprocessors it is the default behavior; on others, you have to put an instruction or two into your interrupt routines to allow this interrupt nesting. The Intel x86 microprocessors, for example, disable all interrupts automatically whenever they enter any interrupt routine; therefore, the interrupt routines must reenable interrupts to allow interrupt nesting. Other processors do not do this, and interrupt nesting happens automatically. In any case, a higher-priority interrupt can interrupt a lower-priority interrupt routine, but not the other way around. If the microprocessor is executing a higher-priority interrupt routine when the hardware asserts the lower-priority interrupt signal, the microprocessor will finish the higher-priority interrupt routine and then execute the lower-priority interrupt routine.

What happens if an interrupt is signaled while the interrupts are disabled? In most cases the microprocessor will remember the interrupt until interrupts are reenabled, at which point it will jump to the interrupt routine. If more than one interrupt is signaled while interrupts are disabled, the microprocessor will do them in priority order when interrupts are reenabled. Interrupts, therefore, are not really disabled; they are merely deferred.
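The "deferred, not disabled" behavior can be modeled with one pending flag per interrupt. The sketch below is a desktop simulation, not real hardware code: every name in it (vSignal, vDisable, vEnable, afPending, and so on) is hypothetical, and it assumes that interrupt 0 has the highest priority.

```c
#include <stdbool.h>

#define NUM_INTERRUPTS 4        /* hypothetical; 0 is the highest priority */

static bool fInterruptsEnabled = true;
static bool afPending[NUM_INTERRUPTS];
static int  aiServiceOrder[16]; /* log of which interrupts ran, in order */
static int  iServiced = 0;

static void vService (int iInterrupt)
{
   if (iServiced < 16)
      aiServiceOrder[iServiced++] = iInterrupt;
}

/* Run every pending interrupt, highest priority first. */
static void vServicePending (void)
{
   int i;

   for (i = 0; i < NUM_INTERRUPTS; ++i)
      if (afPending[i])
      {
         afPending[i] = false;
         vService (i);
      }
}

/* The hardware asserts interrupt request iInterrupt. */
void vSignal (int iInterrupt)
{
   if (iInterrupt < 0 || iInterrupt >= NUM_INTERRUPTS)
      return;

   afPending[iInterrupt] = true;    /* remembered even while disabled */
   if (fInterruptsEnabled)
      vServicePending ();
}

void vDisable (void)
{
   fInterruptsEnabled = false;
}

void vEnable (void)
{
   fInterruptsEnabled = true;
   vServicePending ();              /* deferred interrupts run now */
}
```

If interrupt 2 and then interrupt 0 are signaled while interrupts are disabled, nothing runs until vEnable is called, at which point the model services interrupt 0 first and interrupt 2 second.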

What happens if I disable interrupts and then forget to reenable them? The microprocessor will execute no more interrupt routines, and any processing in your system that depends upon interrupt routines (which is usually all processing in an embedded system) will grind to a halt.

What happens if I disable interrupts when they are already disabled or enable interrupts when they are already enabled? Nothing.

Are interrupts enabled or disabled when the microprocessor first starts up? Disabled.

Can I write my interrupt routines in C? Yes, usually. Most compilers used for embedded-systems code recognize a nonstandard keyword that allows you to tell the compiler that a particular function is an interrupt routine. For example:

void interrupt vHandleTimerIRQ (void)
{
   ...
}

The compiler will add code to vHandleTimerIRQ to save and restore the context. If yours is one of the microprocessors that requires a special assembly-language RETURN instruction for interrupt routines, the compiler will end vHandleTimerIRQ with it. Your C code must deal with the hardware properly (which is usually possible in C) and set up the interrupt vector table with the address of your routine (also usually possible in C). The most common reason for writing interrupt routines in assembly language is that on many microprocessors you can write faster code in assembly language than you can in C. If speed is not an issue, writing your interrupt routines in C is a good idea.

4.3 The Shared-Data Problem

One problem that arises as soon as you use interrupts is that your interrupt routines need to communicate with the rest of your code. For various reasons, some of which we will discuss in Section 4.4, it is usually neither possible nor desirable for the microprocessor to do all its work in interrupt routines. Therefore, interrupt routines need to signal the task code to do follow-up processing. For this to happen, the interrupt routines and the task code must share one or more variables that they can use to communicate with one another.

Figure 4.4 illustrates the classic shared-data problem (also called the data-sharing problem) you encounter when you start to use interrupts. Suppose that the code in Figure 4.4 is part of the nuclear reactor monitoring system we discussed in Chapter 1. This code monitors two temperatures, which are supposed to be equal. If they differ, it indicates a malfunction in your reactor. In the code in Figure 4.4, the function main stays in an infinite loop making sure that the two temperatures are the same. The interrupt routine, vReadTemperatures, happens periodically: perhaps the temperature-sensing hardware interrupts if one or both of the temperatures changes, or perhaps a timer interrupts every few milliseconds to cause the microprocessor to jump to this routine. The interrupt routine reads the new temperatures. The idea is that the system will set off a howling alarm if the temperatures ever turn out to be different.

Before reading on, examine the program in Figure 4.4 and try to find the bug.

What is the problem with the program in Figure 4.4? It sets off the alarm when it shouldn't. To see why, suppose that both temperatures have been 73 degrees for a while; both elements of the iTemperatures array equal 73. Suppose now that the microprocessor has just finished executing this line of code, setting iTemp0 to 73:

iTemp0 = iTemperatures[0];

Suppose that the interrupt occurs now and that both temperatures have changed to 74 degrees. The interrupt routine writes the value 74 into both elements of the iTemperatures array. When the interrupt routine ends, the microprocessor will continue with this line of code:

iTemp1 = iTemperatures[1];

Since both elements of the array are now 74, iTemp1 will be set to 74. When the microprocessor comes to compare iTemp0 to iTemp1 in the next line of code, they will differ and the system will set off the alarm, even though the two measured temperatures were always the same.

Figure 4.4 Classic Shared-Data Problem

static int iTemperatures[2];

void interrupt vReadTemperatures (void)
{
   iTemperatures[0] = !! read in value from hardware
   iTemperatures[1] = !! read in value from hardware
}

void main (void)
{
   int iTemp0, iTemp1;

   while (TRUE)
   {
      iTemp0 = iTemperatures[0];
      iTemp1 = iTemperatures[1];
      if (iTemp0 != iTemp1)
         !! Set off howling alarm;
   }
}

Now examine Figure 4.5. The code in Figure 4.5 is the same as the code in Figure 4.4 except that main does not copy the temperatures into its local variables, but tests the elements of the iTemperatures array directly. Does the program in Figure 4.5 fix the bug in the program in Figure 4.4?

It would be nice if the program in Figure 4.5 solved the problem that we had in Figure 4.4. However, the same bug that was in Figure 4.4 is also in Figure 4.5, just in a more subtle form. The problem is that the statement that compares iTemperatures[0] with iTemperatures[1] can be interrupted. Although the microprocessor usually will not interrupt individual assembly-language instructions, it can interrupt statements in C, since the compiler translates most statements into multiple assembly-language instructions. The statement that compares iTemperatures[0] with iTemperatures[1] turns into assembly language that looks something like that shown in Figure 4.6.

Figure 4.5 Harder Shared-Data Problem

static int iTemperatures[2];

void interrupt vReadTemperatures (void)
{
   iTemperatures[0] = !! read in value from hardware
   iTemperatures[1] = !! read in value from hardware
}

void main (void)
{
   while (TRUE)
   {
      if (iTemperatures[0] != iTemperatures[1])
         !! Set off howling alarm;
   }
}

Figure 4.6 Assembly Language Equivalent of Figure 4.5

      MOVE      R1, (iTemperatures[0])
      MOVE      R2, (iTemperatures[1])
      SUBTRACT  R1, R2
      JCOND     ZERO, TEMPERATURES_OK

      ; Code goes here to set off the alarm

TEMPERATURES_OK:

Consider what happens if the interrupt occurs between the line of code that loads the value iTemperatures[0] into register R1 and the line of code that loads the value iTemperatures[1] into register R2. If both temperatures were 73 degrees before the interrupt and both temperatures are 74 degrees after the interrupt, then register R1, loaded before the interrupt, will have the value 73, and register R2, loaded after the interrupt routine returns, will have the value 74. Note that the interrupt routine will not change the value in register R1: it has no way of knowing what that value represents and, as we discussed in Section 4.1, should not change it. The program in Figure 4.5 therefore has exactly the same problem as the program in Figure 4.4.
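Because the real bug depends on timing, it is hard to reproduce on demand. One way to see the interleaving is to simulate it on a desktop machine by calling a stand-in for the interrupt routine by hand between the two reads. The sketch below does that; the names vSimulatedInterrupt and fAlarmWouldSound are hypothetical, and an ordinary function call is only an approximation of a hardware interrupt.

```c
#include <stdbool.h>

static int iTemperatures[2] = { 73, 73 };

/* Stands in for the interrupt routine: both readings change together. */
static void vSimulatedInterrupt (int iNewTemp)
{
   iTemperatures[0] = iNewTemp;
   iTemperatures[1] = iNewTemp;
}

/* Mimics the task code's comparison. If fInterruptBetweenReads is true,
   the "interrupt" fires after the first read and before the second.
   Returns true if the task code would set off the alarm. */
bool fAlarmWouldSound (bool fInterruptBetweenReads, int iNewTemp)
{
   int iTemp0, iTemp1;

   iTemp0 = iTemperatures[0];
   if (fInterruptBetweenReads)
      vSimulatedInterrupt (iNewTemp);
   iTemp1 = iTemperatures[1];

   return iTemp0 != iTemp1;
}
```

Starting from 73 and 73, fAlarmWouldSound (false, 74) reports no alarm, while fAlarmWouldSound (true, 74) reports an alarm, even though the two array elements are equal both before and after the call.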

Characteristics of the Shared-Data Bug

The problem with the code in Figure 4.4 and in Figure 4.5 is that the iTemperatures array is shared between the interrupt routine and the task code. If the interrupt just happens to occur while the main routine is using iTemperatures, then the bug shows itself.

Bugs such as these are an especially fiendish species. They are difficult to find, because they do not happen every time the code runs. The assembly-language code in Figure 4.6 shows that the bug appears only if the interrupt occurs between the two critical instructions. If the interrupt occurs at any other time, then the program works perfectly. For the interrupt to occur between the two instructions, the hardware must assert the interrupt signal during the execution of the first of the two critical instructions. Since that execution takes a period of time measured in microseconds or possibly even in fractions of microseconds on a fast processor, the likelihood of an interrupt at just that moment may not be particularly high. In fact, bugs such as this are famous for occurring at times such as these:

• 5 o'clock in the afternoon, usually on Friday

• Any time you are not paying very much attention

• Whenever no debugging equipment is attached to the system

• After your product has landed on Mars

• And, of course, during customer demos

Because these bugs often show themselves only rarely and are therefore difficult to find, it pays to avoid putting these bugs into your code in the first place. Whenever an interrupt routine and your task code share data, be suspicious and analyze the situation to ensure that you do not have a shared-data bug.

Solving the Shared-Data Problem

The first method of solving the shared-data problem is to disable interrupts whenever your task code uses the shared data. For example, if the disable function disables interrupts and the enable function enables interrupts, then the code in Figure 4.7 (a modification of the code in Figure 4.4) has no bug. The hardware can assert the interrupt signal requesting service, but the microprocessor will not jump to the interrupt routine while the interrupts are disabled. Because of this, the code in Figure 4.7 always compares two temperatures that were read at the same time.

Figure 4.7 Disabling Interrupts Solves the Shared Data Problem from Figure 4.4

static int iTemperatures[2];

void interrupt vReadTemperatures (void)
{
   iTemperatures[0] = !! read in value from hardware
   iTemperatures[1] = !! read in value from hardware
}

void main (void)
{
   int iTemp0, iTemp1;

   while (TRUE)
   {
      disable ();      /* Disable interrupts while we use the array */
      iTemp0 = iTemperatures[0];
      iTemp1 = iTemperatures[1];
      enable ();

      if (iTemp0 != iTemp1)
         !! Set off howling alarm;
   }
}

C compilers for embedded systems commonly have functions in their libraries to disable and enable interrupts, although they are not always called disable and enable. In assembly language, you can invoke the processor's instructions that enable and disable interrupts. (See Figure 4.8, a revision of Figure 4.6.)

Figure 4.8 Disabling Interrupts in Assembly Language

      DI                                 ; disable interrupts while we use the array
      MOVE      R1, (iTemperatures[0])
      MOVE      R2, (iTemperatures[1])
      EI                                 ; enable interrupts again
      SUBTRACT  R1, R2
      JCOND     ZERO, TEMPERATURES_OK

      ; Code goes here to set off the alarm

TEMPERATURES_OK:

Unfortunately, no C compilers or assemblers are smart enough to figure out when it is necessary to disable interrupts. You must recognize the situations in which interrupts must be disabled and write explicit code to do it when it is necessary.

"Atomic" and "Critical Section"

A part of a program is said to be atomic if it cannot be interrupted. A more precise way to look at the shared-data problem is that it is the problem that arises when an interrupt routine and the task code share data, and the task code uses the shared data in a way that is not atomic. When we disable interrupts around the lines of the task code that use the shared data, we have made that collection of lines atomic, and we have therefore solved the shared-data problem.

Sometimes people use the word "atomic" to mean not that a part of the program cannot be interrupted at all but rather to mean that it cannot be interrupted by anything that might mess up the data it is using. From the perspective of the shared-data problem, the two definitions are equivalent. To solve its shared-data problem, the nuclear reactor program need only disable the interrupt that reads in the temperatures. If other interrupts change other data (the time of day, water pressures, steam pressures, etc.) while the task code is working with the temperatures, that will cause no problem.

A set of instructions that must be atomic for the system to work properly is often called a critical section.

A Few More Examples

In Figure 4.9 the function lSecondsSinceMidnight returns the number of seconds since midnight. A hardware timer asserts an interrupt signal every second, which causes the microprocessor to run the interrupt routine vUpdateTime to update the static variables that keep track of the time.

Figure 4.9 Interrupts with a Timer

static int iSeconds, iMinutes, iHours;

void interrupt vUpdateTime (void)
{
   ++iSeconds;
   if (iSeconds >= 60)
   {
      iSeconds = 0;
      ++iMinutes;
      if (iMinutes >= 60)
      {
         iMinutes = 0;
         ++iHours;
         if (iHours >= 24)
            iHours = 0;
      }
   }

   !! Do whatever needs to be done to the hardware
}

long lSecondsSinceMidnight (void)
{
   return ( (((iHours * 60) + iMinutes) * 60) + iSeconds);
}

From our discussion above, you should see that the program in Figure 4.9 has an obvious bug. If the hardware timer interrupts while the microprocessor is doing the arithmetic in lSecondsSinceMidnight, then the result might be wrong. Suppose, however, that your application will run fine even if the lSecondsSinceMidnight function sometimes returns a value that is one second off. Now is the program okay?

To answer this question, consider what might be a particularly perverse case. We know that the return statement in lSecondsSinceMidnight must read the iHours, iMinutes, and iSeconds variables one at a time from memory and that the interrupt routine may change any or all of those variables in the middle of that process. Suppose that the C compiler produces assembly code that reads the iHours variable first, then the iMinutes, and then the iSeconds. (The ANSI C standard allows compilers to produce code that reads the three variables in any order that is convenient for the fellow who wrote the compiler.) Suppose that the time is 3:59:59. The function lSecondsSinceMidnight might read iHours as 3, but then if the interrupt occurs and changes the time to 4:00:00, lSecondsSinceMidnight will read iMinutes and iSeconds as 0 and return a value that makes it look as though the time is 3:00:00, almost an hour off.
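The perverse case can be worked through in numbers: 3:59:59 is 14,399 seconds after midnight and 4:00:00 is 14,400, but the interrupted read produces 3 * 3600 = 10,800. The sketch below simulates this on a desktop machine. The names vSimulatedTick and lTornSecondsSinceMidnight are hypothetical, the tick is an ordinary function call standing in for the timer interrupt, and the read order (hours first) is the one assumed in the discussion above.

```c
static int iSeconds = 59, iMinutes = 59, iHours = 3;   /* 3:59:59 */

/* Same logic as the interrupt routine in Figure 4.9, called by hand. */
static void vSimulatedTick (void)
{
   ++iSeconds;
   if (iSeconds >= 60)
   {
      iSeconds = 0;
      ++iMinutes;
      if (iMinutes >= 60)
      {
         iMinutes = 0;
         ++iHours;
         if (iHours >= 24)
            iHours = 0;
      }
   }
}

/* Reads the hours, lets one timer "interrupt" occur, then reads the rest. */
long lTornSecondsSinceMidnight (void)
{
   long lHours, lMinutes, lSecs;

   lHours = iHours;          /* reads 3 */
   vSimulatedTick ();        /* time becomes 4:00:00 */
   lMinutes = iMinutes;      /* reads 0 */
   lSecs = iSeconds;         /* reads 0 */

   return ((lHours * 60) + lMinutes) * 60 + lSecs;
}
```

The function returns 10,800, which is 3,599 seconds short of the time before the tick and a full hour short of the time after it.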

One way to fix this problem is to disable interrupts while lSecondsSinceMidnight does its calculation. Just don't do it this way, for obvious reasons:

long lSecondsSinceMidnight (void)
{
   disable ();
   return ( (((iHours * 60) + iMinutes) * 60) + iSeconds);
   enable ();      /* WRONG: This never gets executed! */
}

Better, do it like this:

long lSecondsSinceMidnight (void)
{
   long lReturnVal;

   disable ();
   lReturnVal =
      (((iHours * 60) + iMinutes) * 60) + iSeconds;
   enable ();

   return (lReturnVal);
}

Best, do it as shown in Figure 4.10. A potential problem with the code above is that if lSecondsSinceMidnight is called from within a critical section somewhere else in the program, the function above will cause a bug by enabling interrupts in the middle of that other critical section. Suppose that disable, in addition to disabling interrupts, returns a Boolean variable indicating whether interrupts were enabled when it was called (which some C library functions do). Then the code in Figure 4.10, rather than enabling interrupts at the end of the routine, finds out whether interrupts were enabled at the beginning of the routine and then restores them to the same condition at the end. (A slight disadvantage is that the code in Figure 4.10 will run a little more slowly.)

Figure 4.10 Disabling and Restoring Interrupts

long lSecondsSinceMidnight (void)
{
   long lReturnVal;
   BOOL fInterruptStateOld;   /* Interrupts already disabled? */

   fInterruptStateOld = disable ();

   lReturnVal =
      (((iHours * 60) + iMinutes) * 60) + iSeconds;

   /* Restore interrupts to previous state */
   if (fInterruptStateOld)
      enable ();

   return (lReturnVal);
}
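The difference between always reenabling and restoring the previous state shows up only when critical sections nest. The sketch below models the processor's interrupt-enable flag with an ordinary variable so that the two styles can be compared on a desktop machine. All of the names (fDisable, vEnable, vInnerAlwaysEnables, vInnerRestores, fOuterSectionStillProtected) are hypothetical, and a real disable function would of course act on the hardware.

```c
#include <stdbool.h>

static bool fInterruptsEnabled = true;   /* models the processor's flag */

/* Hypothetical disable: returns whether interrupts were enabled before. */
static bool fDisable (void)
{
   bool fWasEnabled = fInterruptsEnabled;

   fInterruptsEnabled = false;
   return fWasEnabled;
}

static void vEnable (void)
{
   fInterruptsEnabled = true;
}

/* Inner routine that always reenables interrupts when it is done. */
static void vInnerAlwaysEnables (void)
{
   fDisable ();
   /* ... use shared data ... */
   vEnable ();
}

/* Inner routine in the style of Figure 4.10: restores the old state. */
static void vInnerRestores (void)
{
   bool fOld = fDisable ();

   /* ... use shared data ... */
   if (fOld)
      vEnable ();
}

/* Calls the chosen inner routine from inside an outer critical section
   and reports whether interrupts are still disabled afterward, as they
   should be until the outer section ends. */
bool fOuterSectionStillProtected (bool fUseRestoringVersion)
{
   bool fStillDisabled;

   fDisable ();                          /* outer critical section begins */
   if (fUseRestoringVersion)
      vInnerRestores ();
   else
      vInnerAlwaysEnables ();
   fStillDisabled = !fInterruptsEnabled;
   vEnable ();                           /* outer critical section ends */

   return fStillDisabled;
}
```

In this model fOuterSectionStillProtected (false) returns false, because the inner routine has switched interrupts back on in the middle of the outer critical section, while fOuterSectionStillProtected (true) returns true.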

Another Potential Solution

Figure 4.11 shows another potential solution to this problem, this time without disabling interrupts. What do you think of the code in Figure 4.11?

Figure 4.11 Another Shared-Data Problem Solution

static long int lSecondsToday;

void interrupt vUpdateTime (void)
{
   ++lSecondsToday;
   if (lSecondsToday == 60L * 60L * 24L)
      lSecondsToday = 0L;
}

long lSecondsSinceMidnight (void)
{
   return (lSecondsToday);
}

Consider again what causes the shared-data problem: the problem arises if the task code uses the shared variable in a nonatomic way. Does the return statement in lSecondsSinceMidnight use lSecondsToday atomically? It depends. If the microprocessor's registers are large enough to hold a long integer, then the assembly language of the entire lSecondsSinceMidnight function is likely to be

      MOVE      R1, (lSecondsToday)
      RETURN

which is atomic. If the microprocessor's registers are too small to hold a long integer, then the assembly language will be something like:

      MOVE      R1, (lSecondsToday)       ; Get first byte or word
      MOVE      R2, (lSecondsToday+1)     ; Get second byte or word
      RETURN

The number of MOVE instructions is the number of registers it takes to store the long integer. This is not atomic, and it can cause a bug, because if the interrupt occurs while the registers are being loaded, you can get a wildly incorrect result.
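To see how wild the result can be, suppose the long integer is stored as two 16-bit words and holds 65,535 (0x0000FFFF) when the read begins. The sketch below simulates the interrupted read on a desktop machine. The names wLow, wHigh, vSimulatedIncrement, and lTornRead are hypothetical, the two words model a 16-bit microprocessor's view of memory, and the choice to load the low word first is an assumption; a compiler may load the words in either order.

```c
#include <stdint.h>

/* The long integer as a 16-bit machine stores it: two separate words. */
static uint16_t wLow  = 0xFFFF;
static uint16_t wHigh = 0x0000;          /* together: 65535 seconds */

/* Stands in for ++lSecondsToday in the interrupt routine. */
static void vSimulatedIncrement (void)
{
   ++wLow;
   if (wLow == 0)
      ++wHigh;                           /* carry into the high word */
}

/* Loads the low word, lets the "interrupt" run, then loads the high word. */
long lTornRead (void)
{
   uint16_t wR1, wR2;

   wR1 = wLow;                           /* reads 0xFFFF */
   vSimulatedIncrement ();               /* value becomes 0x00010000 */
   wR2 = wHigh;                          /* reads 0x0001 */

   return ((long) wR2 << 16) | wR1;
}
```

The true value is 65,535 before the increment and 65,536 after it, but lTornRead returns 131,071 (0x0001FFFF), nearly twice either one.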

Unless there is some pressing reason not to disable interrupts, it would be foolish to depend upon the mechanism in Figure 4.11 to make your code work. Even if you are using a 32-bit microprocessor today, you might port this code to a 16-bit microprocessor tomorrow. Better to disable interrupts when the function reads from the shared variable and keep the problem away for good. (The interrupt routine in Figure 4.11 is more efficient than the one in Figure 4.9, however, and that efficiency causes no bugs. You might want to use the faster interrupt routine.)

Figure 4.12 A Program That Needs the volatile Keyword

static long int lSecondsToday;

void interrupt vUpdateTime (void)
{
   ++lSecondsToday;
   if (lSecondsToday == 60L * 60L * 24L)
      lSecondsToday = 0L;
}

long lSecondsSinceMidnight (void)
{
   long lReturn;

   /* When we read the same value twice, it must be good. */
   lReturn = lSecondsToday;
   while (lReturn != lSecondsToday)
      lReturn = lSecondsToday;

   return (lReturn);
}

The volatile Keyword

Most compilers assume that a value stays in memory unless the program changes it, and they use that assumption for optimization. This can cause problems. For example, the code in Figure 4.12 is an attempt to fix the shared-data problem in lSecondsSinceMidnight without disabling interrupts. In fact, it is a fix that works, even on processors with 8- and 16-bit registers (as long as the while-loop in lSecondsSinceMidnight executes in less than one second, as it will on any microprocessor). The idea is that if lSecondsSinceMidnight reads the same value from lSecondsToday twice, then no interrupt can have occurred in the middle of the read, and the value must be valid.

Some compilers will conspire against you, however, to cause a new problem. For this line of code

lReturn = lSecondsToday;

the compiler will produce code to read the value of lSecondsToday into one or more registers and save that (possibly messed up) value in lReturn. Then when it gets to the while statement, the optimizer in the compiler will notice that it read the value of lSecondsToday once already and that that value is still in the registers. Instead of re-reading the value from memory, therefore, the compiler produces code to use the (possibly messed up) value in the registers, completely defeating the purpose of the original C program. Some optimizing compilers might even optimize the entire while-loop out of existence, theorizing that since the value of lSecondsToday was just copied to lReturn, the two must be equal and that the condition in the while statement will therefore always be false. In either case, the optimizer in the compiler has reduced this new lSecondsSinceMidnight to the same buggy version we had before.

To avoid this, you need to declare lSecondsToday to be volatile, by adding the volatile keyword somewhere in the declaration. The volatile keyword, part of the C standard, allows you to warn your compiler that certain variables may change because of interrupt routines or other things the compiler doesn't know about.

static volatile long int lSecondsToday;

With the volatile keyword in the declaration the compiler knows that the microprocessor must read the value of lSecondsToday from memory every time it is referenced. The compiler is not allowed to optimize reads or writes of lSecondsToday out of existence.

If your compiler doesn't support the volatile keyword, you should be able to obtain the same result by turning off the compiler optimizations. However, it is probably a good idea in any case to look in the compiler output listing at the assembly language of touchy routines such as lSecondsSinceMidnight to be sure that the compiler produced sensible code.

4.4 Interrupt Latency

Because interrupts are a tool for getting better response from our systems, and because the speed with which an embedded system can respond is always of interest, one obvious question is, "How fast does my system respond to each interrupt?" The answer to this question depends upon a number of factors:

1. The longest period of time during which that interrupt is (or all interrupts are) disabled

2. The period of time it takes to execute any interrupt routines for interrupts that are of higher priority than the one in question

3. How long it takes the microprocessor to stop what it is doing, do the necessary bookkeeping, and start executing instructions within the interrupt routine

4. How long it takes the interrupt routine to save the context and then do enough work that what it has accomplished counts as a "response"

The term interrupt latency refers to the amount of time it takes a system to respond to an interrupt; however, different people include different combinations of the above factors when they calculate this number. In this book, we will include all of the above factors, but you will hear this term used to mean various different things.

The next obvious question is, "How do I get the times associated with the four factors listed above?" You can often find factor 3 by looking in the microprocessor documentation provided by the manufacturer. The other three items you can find in one of two ways. First, you can write the code and measure how long it takes to execute, as we will discuss further in Chapter 10. Second, you can count the instructions of various types and look up in the microprocessor's documentation how long each type of instruction takes. This latter technique works reasonably well for the smaller microprocessors, since the time it takes to do each instruction is deterministic, and the manufacturer can provide the data. It works far less well for microprocessors that cache instructions ahead of time: with these microprocessors, how long an instruction takes depends critically upon whether the instruction was already in the cache and often upon several other unknowable factors as well.

Make Your Interrupt Routines Short

The four factors mentioned above control interrupt latency and, therefore, response. You deal with factor 4 by writing efficient code; we'll not discuss that in this book, since the techniques are the same for embedded systems as for desktop systems. Factor 3 is not under software control. Factor 2 is one of the reasons that it is generally a good idea to write short interrupt routines. Processing time used by an interrupt routine slows response for every other interrupt of the same or lower priority. Although lower-priority interrupts are presumably lower priority because their response time requirements are less critical, this is not necessarily license to make their response dreadful by writing a time-consuming interrupt routine for a higher-priority interrupt.

For example, suppose that you're writing a system that controls a factory, and that every second your system gets two dozen interrupts to which it must respond promptly to keep the factory running smoothly. Suppose that your system monitors a detector that checks for gas leaks, and that your system must call the fire department and shut down the affected part of the factory if a gas leak is detected. Now it is very likely that the interrupt routine that handles gas leaks needs to be relatively high priority, since it would probably be a bad idea for other interrupt routines to get the microprocessor's attention first, especially if those interrupt routines open and close electrical switches and cause an explosion. However, the system needs to continue operating the unaffected part of the factory, so the gas leak interrupt routine must not take up too much time. If calling the fire department (a process that will take several seconds, at least) is included in the gas leak interrupt routine, then dozens of other interrupts will pile up while this is going on, and the rest of the factory may not run properly. Therefore, the telephone call should probably not be part of the interrupt routine.

Disabling Interrupts

The remaining factor that contributes to interrupt latency is the practice of disabling interrupts. Of course, disabling interrupts is sometimes necessary in order to solve the shared-data problem, as we discussed in Section 4.3, but the shorter the period during which interrupts are disabled, the better your response will be.

Let us look at a few examples of how disabling interrupts affects system response. Suppose that the requirements for your system are as follows:

• You have to disable interrupts for 125 microseconds (µsec) for your task code to use a pair of temperature variables it shares with the interrupt routine that reads the temperatures from the hardware and writes them into the variables.

• You have to disable interrupts for 250 µsec for your task code to get the time accurately from variables it shares with the interrupt routine that responds to the timer interrupt.

• You must respond within 625 µsec when you get a special signal from another processor in your system; the interprocessor interrupt routine takes 300 µsec to execute.

Can this be made to work?

It is relatively easy to answer that question. Interrupts are disabled in our hypothetical system for at most 250 µsec at a time. The interrupt routine needs 300 µsec, for a total, worst-case time of 550 µsec, within the 625-µsec limit. (See Figure 4.13.)

Figure 4.13 Worst Case Interrupt Latency

[Timing diagram: the interprocessor interrupt occurs just as the task code disables interrupts for 250 µsec; the processor then gets to the interprocessor ISR, which does its critical work in 300 µsec. Time to deadline: 625 µsec.]

Note that the interrupt will never be delayed for 375 µsec, the sum of the two periods of time during which interrupts are disabled. If the hardware asserts the interprocessor interrupt signal while the system has disabled interrupts in order to read the time, then in at most 250 µsec the system will reenable the interrupts, and the microprocessor will jump to the interrupt routine. The fact that the system might at some other time disable the interrupts for another period of time is irrelevant. The interrupt routine will be executed as soon as the system reenables the interrupts. There is no way, at least on most microprocessors, to enable and then disable interrupts so fast that the microprocessor will not service the pending interrupts.

Suppose, however, that to cut costs, the hardware group proposes to replace the microprocessor with one that runs only half as fast. Now, all the processing times are doubled, interrupts are disabled for twice as long, the interrupt service routine takes twice as long, but the 625-µsec deadline remains the same. Now will the system meet its deadline?

The answer is no. Interrupts will be disabled for up to 500 µsec at a time, and the interrupt service routine needs 600 µsec to do its work. The total of these two is 1100 µsec, much longer than the 625-µsec deadline.

Suppose that we manage to talk the hardware group out of the idea of the slower processor, but now the marketing group wants to add networking capability to the system. Suppose that the interrupt routine for the network hardware will take 100 µsec to do its work. Will the system respond to the interprocessor interrupt quickly enough?

It depends. If you can assign the network interrupt a lower priority than the interprocessor interrupt (and if the microprocessor can still service the network interrupt quickly enough), then the network interrupt has no effect on the response of the interprocessor interrupt, which will therefore still be fast enough. However, if the network interrupt has a higher priority, then the time taken by the network interrupt routine adds to the interrupt latency for the interprocessor interrupt and runs it beyond the deadline. (See Figure 4.14.)

Figure 4.14 Worst Case Interrupt Latency

[Timing diagram: the interprocessor interrupt occurs while the task code has interrupts disabled for 250 µsec; the processor then gets to the network ISR, which takes 100 µsec, and only after that gets to the interprocessor ISR, which does its critical work in 300 µsec. Time to deadline: 625 µsec.]
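The arithmetic in the three scenarios follows one pattern: add the longest period during which interrupts are disabled, the time taken by any higher-priority interrupt routines, and the time the routine in question needs, and then compare the sum with the deadline. The sketch below writes that out. The function names lWorstCaseResponse and fMeetsDeadline are hypothetical, and the time the microprocessor takes to begin the routine (factor 3) is taken as negligible, as it is in the examples above.

```c
#include <stdbool.h>

/* Worst-case response time in microseconds:
   lMaxDisabled    - longest period with interrupts disabled
   lHigherPriority - total time of higher-priority interrupt routines
   lRoutine        - time the interrupt routine itself needs
   The time for the processor to begin the routine is assumed to be
   negligible here. */
long lWorstCaseResponse (long lMaxDisabled, long lHigherPriority,
                         long lRoutine)
{
   return lMaxDisabled + lHigherPriority + lRoutine;
}

/* Returns true if the response arrives no later than the deadline. */
bool fMeetsDeadline (long lResponse, long lDeadline)
{
   return lResponse <= lDeadline;
}
```

For the original system, lWorstCaseResponse (250, 0, 300) is 550 µsec, which meets the 625-µsec deadline. For the half-speed processor, lWorstCaseResponse (500, 0, 600) is 1100 µsec, which does not. With a higher-priority network interrupt, lWorstCaseResponse (250, 100, 300) is 650 µsec, which misses the deadline by 25 µsec.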

Alternatives to Disabling Interrupts

Since disabling interrupts increases interrupt latency, you should know a few alternative methods for dealing with shared data. In this section, we will discuss a few examples. Because in most cases simply disabling interrupts is more robust than the techniques discussed below, you should use them only for those dire situations in which you can't afford the added latency. All of the examples in this section have been very carefully crafted; very small changes can introduce disastrous bugs.

Figure 4.15 Avoiding Disabling Interrupts

static int iTemperaturesA[2];
static int iTemperaturesB[2];
static BOOL fTaskCodeUsingTempsB = FALSE;

void interrupt vReadTemperatures (void)
{
   if (fTaskCodeUsingTempsB)
   {
      iTemperaturesA[0] = !! read in value from hardware;
      iTemperaturesA[1] = !! read in value from hardware;
   }
   else
   {
      iTemperaturesB[0] = !! read in value from hardware;
      iTemperaturesB[1] = !! read in value from hardware;
   }
}

void main (void)
{
   while (TRUE)
   {
      if (fTaskCodeUsingTempsB)
      {
         if (iTemperaturesB[0] != iTemperaturesB[1])
            !! Set off howling alarm;
      }
      else
      {
         if (iTemperaturesA[0] != iTemperaturesA[1])
            !! Set off howling alarm;
      }

      fTaskCodeUsingTempsB = !fTaskCodeUsingTempsB;
   }
}

The program in Figure 4.15 maintains two sets of temperatures, one in the iTemperaturesA array and the other in the iTemperaturesB array. The fTaskCodeUsingTempsB variable keeps track of which array the task code is currently examining. The interrupt routine always writes to whichever set the task code is not using. This simple mechanism solves the shared-data problem, because the interrupt routine will never write into the set of temperatures that the task code is reading. (Needless to say, in production code you would probably use a two-dimensional array; we used two arrays in this example to make it obvious what was going on.)

The disadvantage of this code is that the while-loop in main may be executed twice before it sets off the alarm, because the task code may check the wrong set of temperatures first.
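The two-pass delay can be seen in a small host-side simulation. This is a sketch under stated assumptions: `isr_store` stands in for the interrupt routine and `task_pass` for one trip around the while-loop, and both names are invented for illustration. On real hardware the store would run asynchronously as an interrupt.

```c
#include <stdbool.h>

static volatile int temps_a[2];
static volatile int temps_b[2];
static volatile bool task_using_b = false;

/* Hypothetical stand-in for the interrupt routine: it always writes to the
   set of temperatures that the task code is NOT currently examining. */
void isr_store(int t0, int t1)
{
    if (task_using_b) {
        temps_a[0] = t0;
        temps_a[1] = t1;
    } else {
        temps_b[0] = t0;
        temps_b[1] = t1;
    }
}

/* Hypothetical stand-in for one pass of the main loop: checks the current
   set, then switches sets. Returns true if the alarm should go off. */
bool task_pass(void)
{
    bool mismatch;

    if (task_using_b)
        mismatch = (temps_b[0] != temps_b[1]);
    else
        mismatch = (temps_a[0] != temps_a[1]);

    task_using_b = !task_using_b;
    return mismatch;
}
```

If mismatched readings arrive while the task is looking at set A, they land in set B. The first call to `task_pass` inspects the stale set A and reports nothing; only the second call sees the mismatch.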

Now examine Figure 4.16. In this version of the program, the interrupt routine writes pairs of temperatures to the iTemperatureQueue queue. Because the iHead pointer and the iTail pointer ensure that the interrupt routine will be writing to different locations in the queue than the ones from which the task code is reading, the shared-data problem with the temperatures themselves is eliminated. At the expense of quite a bit of complication, this code gets the temperature data to the task code without disabling interrupts.

Figure 4.16 A Circular Queue Without Disabling Interrupts

    #define QUEUE_SIZE 100

    int iTemperatureQueue[QUEUE_SIZE];
    int iHead = 0;    /* Place to add next item */
    int iTail = 0;    /* Place to read next item */

    void interrupt vReadTemperatures (void)
    {
        /* If the queue is not full ... */
        if (!((iHead+2 == iTail) ||
              (iHead == QUEUE_SIZE-2 && iTail == 0)))
        {
            iTemperatureQueue[iHead] = !! read one temperature;
            iTemperatureQueue[iHead + 1] = !! read other temperature;
            iHead += 2;
            if (iHead == QUEUE_SIZE)
                iHead = 0;
        }
        else
            !! throw away next value
    }

    void main (void)
    {
        int iTemperature1, iTemperature2;

        while (TRUE)
        {
            /* If there is any data ... */
            if (iTail != iHead)
            {
                iTemperature1 = iTemperatureQueue[iTail];
                iTemperature2 = iTemperatureQueue[iTail + 1];
                iTail += 2;
                if (iTail == QUEUE_SIZE)
                    iTail = 0;
                !! Do something with iTemperature1 and iTemperature2;
            }
        }
    }

The disadvantage of the code in Figure 4.16 is that it is very fragile. Either of these seemingly minor changes can cause bugs:

• The task code must be sure to read the data from the queue first and move the tail pointer second. Reversing these two operations would allow the interrupt routine to write into the queue at the location from which the task code is reading and cause a shared-data bug.

• When iTail is incremented by two in the task code, the write to that variable must be atomic. This is almost certain to be true, but if you are using an 8-bit processor and your array is larger than 256 entries long, it might not be. If the modification of the tail pointer is not atomic, then a potential bug lurks in this program.

Because of the fragility of this code, it would make sense to write it this way only if disabling interrupts is really not an option.
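The ordering rule described above (move the data first, publish the pointer second) can be shown in a compact, host-testable sketch. It is an illustration only, not the book's code: the names `queue_put_pair` and `queue_get_pair` are hypothetical, the queue is shrunk to 8 entries, and the sketch assumes exactly one interrupt routine produces, exactly one task consumes, and a write to an `int` is atomic on the target processor.

```c
#include <stdbool.h>

#define QUEUE_SIZE 8   /* must be even: entries are stored in pairs */

static volatile int queue[QUEUE_SIZE];
static volatile int head = 0;   /* written only by the producer */
static volatile int tail = 0;   /* written only by the consumer */

/* Producer side (the interrupt routine in a real system).
   Returns false and discards the pair if the queue is full. */
bool queue_put_pair(int t1, int t2)
{
    int next = head + 2;

    if (next == QUEUE_SIZE)
        next = 0;
    if (next == tail)
        return false;            /* full: throw the pair away */

    queue[head] = t1;            /* store the data first ... */
    queue[head + 1] = t2;
    head = next;                 /* ... then publish it with a single write */
    return true;
}

/* Consumer side (the task code). Returns false if the queue is empty. */
bool queue_get_pair(int *t1, int *t2)
{
    int next;

    if (tail == head)
        return false;            /* empty */

    *t1 = queue[tail];           /* read the data first ... */
    *t2 = queue[tail + 1];
    next = tail + 2;
    if (next == QUEUE_SIZE)
        next = 0;
    tail = next;                 /* ... then release the slot */
    return true;
}
```

Computing the wrapped index in a local variable and storing it once means neither pointer ever holds an out-of-range value, which removes one way for a small edit to go wrong. The sketch keeps one pair of slots empty to tell a full queue from an empty one, so an 8-entry queue holds three pairs.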


Chapter Summary

• Some characteristics of assembly language are the following:

    • Each instruction translates into one microprocessor instruction, unlike C.

    • Instructions move data from memory to registers within the microprocessor, other instructions indicate operations to be performed on the data in the registers, and yet other instructions move the data from registers back into memory.

    • Typical assembly languages have jump instructions and conditional jump instructions, call and return instructions, and instructions to put data on and remove data from a stack in the memory.

• When an I/O device signals the microprocessor that it needs service by asserting a signal attached to one of the microprocessor's interrupt request pins, the microprocessor suspends whatever it is doing and executes the corresponding interrupt routine before continuing the task code.

• Interrupt routines must save the context and restore the context.

• Microprocessors allow your software to disable interrupts (except for the nonmaskable interrupt) when your software has critical processing to do.

• When interrupt routines and task code share data, you must ensure that they don't interfere with one another. The first method for doing this is to disable interrupts while the task code uses the shared data.

• A set of instructions that must not be interrupted if the system is to work properly is called a critical section. A set of instructions that will not be interrupted (because, for example, interrupts are disabled) is said to be atomic.

• You should not assume that any statement in C is atomic.

• The volatile keyword warns the compiler that an interrupt routine might change the value of a variable so that the compiler will not optimize your code in a way that will make it fail.

• Interrupt latency is the amount of time it takes a system to respond to an interrupt. Several factors contribute to this. To keep your interrupt latency low (and your response good) you should

    • Make your interrupt routines short.

    • Disable interrupts for only short periods of time.

• Although there are techniques to avoid disabling interrupts, they are fragile and should only be used if absolutely necessary.


Problems

1. The interrupt routine shown in Figure 4.17 is the same one that we discussed in the text. Now someone has written a subroutine to change the time zone by changing the iHours variable. The subroutine takes into account the difference in the two time zones and then makes adjustments to deal with the fact that one or both of the two time zones may currently be observing daylight savings time. To reduce the period during which this subroutine must disable interrupts, the subroutine copies the iHours variable into the local, nonshared iHoursTemp variable, does the calculation, and copies the final result back at the end. Does this work?

2. Figure 4.11 has a shared-data bug when the registers in the microprocessor are not as large as the data space needed to store a long integer. Suppose that long integers are 32 bits long and that your microprocessor has 16-bit registers. How far off can the result of lSecondsSinceMidnight be? What if your microprocessor has 8-bit registers?

3. Even if your microprocessor has 32-bit registers, Figure 4.11 has another potential subtle bug. This bug will show itself if your system has an interrupt that is higher priority than the timer interrupt that corresponds to vUpdateTime and if the interrupt routine for that higher-priority interrupt uses lSecondsSinceMidnight. What is this bug, and how might you fix it?

4. If we change the problem in Figure 4.14 so that the networking interrupt is a lower-priority interrupt and if we assume that the interprocessor interrupt takes 350 μsec, then what is the worst-case interrupt latency for the networking interrupt?

5. The task code and the interrupt routine in Figure 4.15 share the variable fTaskCodeUsingTempsB. Is the task code's use of that variable atomic? Is it necessary for it to be atomic for the system to work?

6. Figure 4.18 is another endeavor to write queuing functions without disabling interrupts. Even assuming that all of the writes to the variables are atomic, a very nasty bug is hiding in this program. What is it?


Figure 4.17 Reducing Time During Which Interrupts Are Disabled

    static int iSeconds, iMinutes, iHours;

    void interrupt vUpdateTime (void)
    {
        ++iSeconds;
        if (iSeconds >= 60)
        {
            iSeconds = 0;
            ++iMinutes;
            if (iMinutes >= 60)
            {
                iMinutes = 0;
                ++iHours;
                if (iHours >= 24)
                    iHours = 0;
            }
        }

        !! Deal with the hardware
    }

    void vSetTimeZone (int iZoneOld, int iZoneNew)
    {
        int iHoursTemp;

        /* Get the current 'hours' of the time */
        disable ();
        iHoursTemp = iHours;
        enable ();

        /* Adjust for the new time zone. */
        iHoursTemp = iHoursTemp + iZoneNew - iZoneOld;

        /* Adjust for daylight savings time, since not all places in
           the world go to daylight savings time at the same time. */
        if (fIsDaylightSavings (iZoneOld))
            ++iHoursTemp;
        if (fIsDaylightSavings (iZoneNew))
            --iHoursTemp;

        /* Save the new 'hours' of the time */
        disable ();
        iHours = iHoursTemp;
        enable ();
    }


Figure 4.18 A Queue That Doesn't Quite Work

    int iQueue[100];
    int iHead = 0;    /* Place to add next item */
    int iTail = 0;    /* Place to read next item */

    void interrupt SourceInterrupt (void)
    {
        /* If the queue is full ... */
        if ((iHead+1 == iTail) || (iHead == 99 && iTail == 0))
        {
            /* ... throw away the oldest element. */
            ++iTail;
            if (iTail == 100)
                iTail = 0;
        }

        iQueue[iHead] = !! next value;
        ++iHead;
        if (iHead == 100)
            iHead = 0;
    }

    void SinkTask (void)
    {
        int iValue;

        while (TRUE)
            if (iTail != iHead)
            {
                iValue = iQueue[iTail];
                ++iTail;
                if (iTail == 100)
                    iTail = 0;
                !! Do something with iValue;
            }
    }

Survey of Software Architectures

In this chapter we will discuss various architectures for embedded software: the basic structures that you can use to put your systems together.

The most important factor that determines which architecture will be the most appropriate for any given system is how much control you need to have over system response. How hard it will be to achieve good response depends not only on the absolute response time requirements but also on the speed of your microprocessor and the other processing requirements. A system with little to do whose response-time requirements are few and not particularly stringent can be written with a very simple architecture. A system that must respond rapidly to many different events and that has various processing requirements, all with different deadlines and different priorities, will require a more complex architecture.

We will discuss four architectures, starting with the simplest one, which offers you practically no control of your response and priorities, and moving on to others that give you greater control but at the cost of increased complexity. The four are round-robin, round-robin with interrupts, function-queue-scheduling, and real-time operating system. At the end of the chapter are a few thoughts about how you might go about selecting an architecture.

5.1 Round-Robin

The code in Figure 5.1 is the prototype for round-robin, the simplest imaginable architecture. There are no interrupts. The main loop simply checks each of the I/O devices in turn and services any that need service.

Figure 5.1 Round-Robin Architecture

    void main (void)
    {
        while (TRUE)
        {
            if (!! I/O Device A needs service)
            {
                !! Take care of I/O Device A
                !! Handle data to or from I/O Device A
            }

            if (!! I/O Device B needs service)
            {
                !! Take care of I/O Device B
                !! Handle data to or from I/O Device B
            }

            etc.
            etc.

            if (!! I/O Device Z needs service)
            {
                !! Take care of I/O Device Z
                !! Handle data to or from I/O Device Z
            }
        }
    }

This is a marvelously simple architecture (no interrupts, no shared data, no latency concerns) and therefore always an attractive potential architecture, as long as you can get away with it.

Simple as it is, the round-robin architecture is adequate for some jobs. Consider, for example, a digital multimeter such as the one shown in Figure 5.2. A digital multimeter measures electrical resistance, current, and potential in units of ohms, amps, and volts, each in several different ranges. A typical multimeter has two probes that the user touches to two points on the circuit to be measured, a digital display, and a big rotary switch that selects which measurement to make and in what range. The system makes continuous measurements and changes the display to reflect the most recent measurement.

Possible pseudo-code for a multimeter is shown in Figure 5.3. Each time around its loop, it checks the position of the rotary switch and then branches to code to make the appropriate measurement, to format its results, and to write the results to the display. Even a very modest microprocessor can go around this loop many times each second.

Figure 5.2 Digital Multimeter

[Drawing: a multimeter with a digital display, a rotary switch, and two probes.]

Round-robin works well for this system because there are only three I/O devices, no particularly lengthy processing, and no tight response requirements. The microprocessor can read the hardware that actually makes the measurements at any time. The display can be written to at whatever speed is convenient for the microprocessor. When the user changes the position of the rotary switch, he's unlikely to notice the few fractions of a second it takes for the microprocessor to get around the loop. (In many cases the user is probably so busy repositioning the probes that he would not notice even a fairly lengthy delay; the user has only two hands, and if one of them is turning the rotary switch, then one of the probes is probably lying on his bench.) The round-robin architecture is adequate to meet all of these requirements, and its simplicity makes it a very attractive choice for this system.

Figure 5.3 Code for Digital Multimeter

    void vDigitalMultiMeterMain (void)
    {
        enum {OHMS_1, OHMS_10, ..., VOLTS_100} eSwitchPosition;

        while (TRUE)
        {
            eSwitchPosition = !! Read the position of the switch;

            switch (eSwitchPosition)
            {
                case OHMS_1:
                    !! Read hardware to measure ohms
                    !! Format result
                    break;

                case OHMS_10:
                    !! Read hardware to measure ohms
                    !! Format result
                    break;

                case VOLTS_100:
                    !! Read hardware to measure volts
                    !! Format result
                    break;
            }

            !! Write result to display
        }
    }

Unfortunately, the round-robin architecture has only one advantage over other architectures, simplicity, whereas it has a number of problems that make it inadequate for many systems:

• If any one device needs response in less time than it takes the microprocessor to get around the main loop in the worst-case scenario, then the system won't work. In Figure 5.1, for example, if device Z can wait no longer than 7 milliseconds for service, and if the pieces of code that service devices A and B take 5 milliseconds each, then the processor won't always get to device Z quickly enough. Now you can squeeze just a little more out of the round-robin architecture by testing device A, then Z, then B, then Z, and so on, but there is a limit to how much of this you can do. The world is full of I/O devices that need fairly rapid service: serial ports, network ports, push buttons, etc.

• Even if none of the required response times are absolute deadlines, the system may not work well if there is any lengthy processing to do. For example, if any one of the cases in Figure 5.3 were to take, say, 3 seconds, then the system's response to the rotary switch may get as bad as 3 seconds. This may not quite meet the definition of "not working," but it would probably not be a system that anyone would be proud to ship.

• This architecture is fragile. Even if you manage to tune it up so that the microprocessor gets around the loop quickly enough to satisfy all the requirements, a single additional device or requirement may break everything.

Because of these shortcomings, a round-robin architecture is probably suitable only for very simple devices such as digital watches and microwave ovens and possibly not even for those.
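The device Z example can be worked through in a few lines. This is a sketch under stated assumptions: the function name is invented, the time spent merely testing each device is ignored, and the 1-millisecond service time used for device Z in the usage note is a made-up value that does not affect the answer.

```c
#include <stddef.h>

/* Worst-case wait, in milliseconds, for one device in a pure round-robin
   loop: the device asks for service just after the loop has passed it, and
   every other device needs service on the way around. Hypothetical helper. */
unsigned worst_case_wait_ms(const unsigned service_ms[], size_t count,
                            size_t device)
{
    unsigned total = 0;

    for (size_t i = 0; i < count; ++i)
        if (i != device)
            total += service_ms[i];
    return total;
}
```

With service times of 5, 5, and 1 milliseconds for devices A, B, and Z, the worst-case wait for Z (index 2) is 5 + 5 = 10 milliseconds, which exceeds the 7 milliseconds that Z can tolerate. Adding a fourth device to the array lengthens every other device's worst case, which is one way to see why the architecture is fragile.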

5.2 Round-Robin with Interrupts

Figure 5.4 illustrates a somewhat more sophisticated architecture, which we will call round-robin with interrupts. In this architecture, interrupt routines deal with the very urgent needs of the hardware and then set flags; the main loop polls the flags and does any follow-up processing required by the interrupts.

This architecture gives you a little bit more control over priorities. The interrupt routines can get good response, because the hardware interrupt signal causes the microprocessor to stop whatever it is doing in the main function and execute the interrupt routine instead. Effectively, all of the processing that you put into the interrupt routines has a higher priority than the task code in the main routine. Further, since you can usually assign priorities to the various interrupts in your system, as we discussed in Chapter 4, you can control the priorities among the interrupt routines as well.

The contrast between the priority control you have with round-robin and with round-robin with interrupts is shown in Figure 5.5. This contrast is the principal advantage of using interrupts rather than a pure round-robin architecture. The disadvantage is that fDeviceA, fDeviceB, fDeviceZ, and who knows what other data in Figure 5.4 are shared between the interrupt routines and the task code in main, and all of the shared-data problems can potentially jump up and bite you. Once committed to this architecture, you are committed to using the various techniques that we discussed in Chapter 4 for dealing with shared data.

Figure 5.4 Round-Robin with Interrupts Architecture

    BOOL fDeviceA = FALSE;
    BOOL fDeviceB = FALSE;
    BOOL fDeviceZ = FALSE;

    void interrupt vHandleDeviceA (void)
    {
        !! Take care of I/O Device A
        fDeviceA = TRUE;
    }

    void interrupt vHandleDeviceB (void)
    {
        !! Take care of I/O Device B
        fDeviceB = TRUE;
    }

    void interrupt vHandleDeviceZ (void)
    {
        !! Take care of I/O Device Z
        fDeviceZ = TRUE;
    }

    void main (void)
    {
        while (TRUE)
        {
            if (fDeviceA)
            {
                fDeviceA = FALSE;
                !! Handle data to or from I/O Device A
            }

            if (fDeviceB)
            {
                fDeviceB = FALSE;
                !! Handle data to or from I/O Device B
            }

            if (fDeviceZ)
            {
                fDeviceZ = FALSE;
                !! Handle data to or from I/O Device Z
            }
        }
    }

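Figure 5.5 Priority Levels for Round-Robin Architectures

[Diagram: two columns, round-robin and round-robin with interrupts, each ordered from high-priority processing at the top to low-priority processing at the bottom. In the round-robin-with-interrupts column the device ISRs, down to the Device Z ISR, rank above all task code.]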

Round-Robin-with-Interrupts Example: A Simple Bridge

The round-robin-with-interrupts architecture is suitable for many systems, ranging from the fairly simple to the surprisingly complex. One example at the simple end of the range is a communications bridge, a device with two ports on it that forwards data traffic received on the first port to the second and vice versa. Let's suppose for the purpose of this example that the data on one of the ports is encrypted and that it is the job of the bridge to encrypt and decrypt the data as it passes it through. Such a device is shown in Figure 5.6.

Figure 5.6 Communications Bridge

[Diagram: a bridge connecting Communication Link A to Communication Link B (with encrypted data). Data is forwarded from A to B and from B to A.]

Let's make the following assumptions about the bridge:

• Whenever a character is received on one of the communication links, it causes an interrupt, and that interrupt must be serviced reasonably quickly, because the microprocessor must read the character out of the I/O hardware before the next character arrives.

• The microprocessor must write characters to the I/O hardware one at a time. After the microprocessor writes a character, the I/O transmitter hardware on that communication link will be busy while it sends the character; then it will interrupt to indicate that it is ready for the next character. There is no hard deadline by which the microprocessor must write the next character to the I/O hardware.

• We have routines that will read characters from and write characters to queues and test whether a queue is empty or not. We can call these routines from interrupt routines as well as from the task code, and they deal correctly with the shared-data problems.

• The encryption routine can encrypt characters one at a time, and the decryption routine can decrypt characters one at a time.

Possible code for a very simple bridge is shown in Figure 5.7. In this code the microprocessor executes the interrupt routines vGotCharacterOnLinkA and vGotCharacterOnLinkB whenever the hardware receives a character. The interrupt routines read the characters from the hardware and put them into the queues qDataFromLinkA and qDataFromLinkB. The task code in the main routine calls vEncrypt and vDecrypt, which read these queues, encrypt and decrypt the data, and write the data to qDataToLinkA and qDataToLinkB. The main routine polls these queues to see whether there is any data to be sent out. The queues are shared, but the queue routines are written to deal with the shared-data problems.

Figure 5.7 Code for a Simple Bridge

    #define QUEUE_SIZE 100

    typedef struct
    {
        char chQueue[QUEUE_SIZE];
        int iHead;    /* Place to add next item */
        int iTail;    /* Place to read next item */
    } QUEUE;

    static QUEUE qDataFromLinkA;
    static QUEUE qDataFromLinkB;
    static QUEUE qDataToLinkA;
    static QUEUE qDataToLinkB;
    static BOOL fLinkAReadyToSend = TRUE;
    static BOOL fLinkBReadyToSend = TRUE;

    void interrupt vGotCharacterOnLinkA (void)
    {
        char ch;

        ch = !! Read character from Communications Link A;
        vQueueAdd (&qDataFromLinkA, ch);
    }

    void interrupt vGotCharacterOnLinkB (void)
    {
        char ch;

        ch = !! Read character from Communications Link B;
        vQueueAdd (&qDataFromLinkB, ch);
    }

    void interrupt vSentCharacterOnLinkA (void)
    {
        fLinkAReadyToSend = TRUE;
    }

    void interrupt vSentCharacterOnLinkB (void)
    {
        fLinkBReadyToSend = TRUE;
    }

    void main (void)
    {
        char ch;

        /* Initialize the queues */
        vQueueInitialize (&qDataFromLinkA);
        vQueueInitialize (&qDataFromLinkB);
        vQueueInitialize (&qDataToLinkA);
        vQueueInitialize (&qDataToLinkB);

        /* Enable the interrupts. */
        enable ();

        while (TRUE)
        {
            vEncrypt ();
            vDecrypt ();

            /* Send only when the link is idle AND data is waiting. */
            if (fLinkAReadyToSend && fQueueHasData (&qDataToLinkA))
            {
                ch = chQueueGetData (&qDataToLinkA);
                disable ();
                !! Send ch to Link A
                fLinkAReadyToSend = FALSE;
                enable ();
            }

            if (fLinkBReadyToSend && fQueueHasData (&qDataToLinkB))
            {
                ch = chQueueGetData (&qDataToLinkB);
                disable ();
                !! Send ch to Link B
                fLinkBReadyToSend = FALSE;
                enable ();
            }
        }
    }

    void vEncrypt (void)
    {
        char chClear;
        char chCryptic;

        /* While there are characters from port A ... */
        while (fQueueHasData (&qDataFromLinkA))
        {
            /* ... Encrypt them and put them on queue for port B */
            chClear = chQueueGetData (&qDataFromLinkA);
            chCryptic = !! Do encryption (this code is a deep secret)
            vQueueAdd (&qDataToLinkB, chCryptic);
        }
    }

    void vDecrypt (void)
    {
        char chClear;
        char chCryptic;

        /* While there are characters from port B ... */
        while (fQueueHasData (&qDataFromLinkB))
        {
            /* ... Decrypt them and put them on queue for port A */
            chCryptic = chQueueGetData (&qDataFromLinkB);
            chClear = !! Do decryption (no one understands this code)
            vQueueAdd (&qDataToLinkA, chClear);
        }
    }

The two variables fLinkAReadyToSend and fLinkBReadyToSend keep track of whether the I/O hardware is ready to send characters on the two communications links. Whenever the task code sends a character to one of the links, it sets the corresponding variable to FALSE, because the I/O hardware is now busy. When the character has been sent, the I/O hardware will interrupt, and the interrupt routine sets the variable to TRUE. Note that when the task code writes to the hardware or to these variables, it must disable interrupts to avoid the shared-data problem.

The interrupt routines receive characters and write them to the queues; therefore, that processing will take priority over the process of moving characters among the queues, encrypting and decrypting them, and sending them out. In this way a sudden burst of characters will not overrun the system, even if the encryption and the decryption processes are time-consuming.

Round-Robin-with-Interrupts Example: The Cordless Bar-Code Scanner

Similarly, the round-robin-with-interrupts architecture would work well for the cordless bar-code scanner introduced in Chapter 1. Although more complicated than the simple bridge in Figure 5.7, the bar-code scanner is essentially a device that gets the data from the laser that reads the bar codes and sends that data out on the radio. In this system, as in the bridge, the only real response requirements are to service the hardware quickly enough. The task code processing will get done quickly enough in a round-robin loop.

Characteristics of the Round-Robin-with-Interrupts Architecture

The primary shortcoming of the round-robin-with-interrupts architecture (other than that it is not as simple as the plain round-robin architecture) is that all of the task code executes at the same priority. Suppose that the parts of the task code in Figure 5.4 that deal with devices A, B, and C take 200 milliseconds each. If devices A, B, and C all interrupt when the microprocessor is executing the statements at the top of the loop, then the task code for device C may have to wait for 400 milliseconds before it starts to execute.

If this is not acceptable, one solution is to move the task code for device C into the interrupt routine for device C. Putting code into interrupt routines is the only way to get it to execute at a higher priority under this architecture. This, however, will make the interrupt routine for device C take 200 milliseconds more than before, which increases the response times for the interrupt routines for lower-priority devices D, E, and F by 200 milliseconds, which may also be unacceptable.

Alternatively, you could have your main loop test the flags for the devices in a sequence something like this: A, C, B, C, D, C, E, C, ..., testing the flag for device C more frequently than the flags for the other devices, much as we suggested for the round-robin architecture. This will improve the response for the task code for device C at the expense of the task code for every other device. Sometimes you can balance your response time requirements with this technique, but it is often more trouble than it is worth, and it will be fragile.

In general, the worst-case response for the task code for any given device occurs when the interrupt for the given device happens just after the round-robin loop passes the task code for that device, and every other device needs service. If main in Figure 5.4 has just checked the fDeviceA flag and found it to be FALSE when device A interrupts, then main will get around to dealing with the data from device A right after it has dealt with any data from devices B, C, D, and so on up to Z and then comes back to the top of the loop. The worst-case response is therefore the sum of the execution times of the task code for every other device (plus the execution times of any interrupt routines that happen to occur, which we assume are short).

Examples of systems for which the round-robin-with-interrupts architecture does not work well include the following:

• A laser printer. As we discussed in Chapter 1, calculating the locations where the black dots go is very time-consuming. If you use the round-robin-with-interrupts architecture, the only code that will get good response is code in interrupt routines. Any task code may potentially be stuck while the system calculates more locations for black dots. Unfortunately, a laser printer may have many other processing requirements, and if all the code goes into interrupt routines, it becomes impossible to make sure that the low-priority interrupts are serviced quickly enough.

• The underground tank-monitoring system. The tank-monitoring system, like the laser printer, has a processor hog: the code that calculates how much gasoline is in the tanks. To avoid putting all the rest of the code into interrupt routines, a more sophisticated architecture is required for this system as well.

5.3 Function-Queue-Scheduling Architecture

Figure 5.8 shows another, yet more sophisticated architecture, what we will call the function-queue-scheduling architecture. In this architecture, the interrupt routines add function pointers to a queue of function pointers for the main function to call. The main routine just reads pointers from the queue and calls the functions.

Figure 5.8 Function-Queue-Scheduling Architecture

    !! Queue of function pointers;

    void interrupt vHandleDeviceA (void)
    {
        !! Take care of I/O Device A
        !! Put function_A on queue of function pointers
    }

    void interrupt vHandleDeviceB (void)
    {
        !! Take care of I/O Device B
        !! Put function_B on queue of function pointers
    }

    void main (void)
    {
        while (TRUE)
        {
            /* Wait here until an interrupt routine queues something. */
            while (!! Queue of function pointers is empty)
                ;

            !! Call first function on queue
        }
    }

    void function_A (void)
    {
        !! Handle actions required by device A
    }

    void function_B (void)
    {
        !! Handle actions required by device B
    }

What makes this architecture worthwhile is that no rule says main has to call the functions in the order that the interrupt routines occurred. It can call them based on any priority scheme that suits your purposes. Any task code functions that need quicker response can be executed earlier. All this takes is a little clever coding in the routines that queue up the function pointers.

In this architecture the worst wait for the highest-priority task code function is the length of the longest of the task code functions (again, plus the execution times of any interrupt routines that happen to occur). This worst case happens if the longest task code function has just started when the interrupt for the highest-priority device occurs. This is a rather better response than the round-robin-with-interrupts response, which, as we discussed, is the sum of the times taken by all the handlers. The trade-off for this better response, in addition to the complication, is that the response for lower-priority task code functions may get worse. Under the round-robin-with-interrupts architecture, all of the task code gets a chance to run each time main goes around the loop. Under this architecture, lower-priority functions may never execute if the interrupt routines schedule the higher-priority functions frequently enough to use up all of the microprocessor's available time.

Although the function-queue-scheduling architecture reduces the worst-case response for the high-priority task code, it may still not be good enough because if one of the lower-priority task code functions is quite long, it will affect the response for the higher-priority functions. In some cases you can get around this problem by rewriting long functions in pieces, each of which schedules the next piece by adding it to the function queue, but this gets complicated. These are the cases which call for a real-time operating system architecture.
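The "little clever coding" in the queuing routines might look something like the following sketch. Everything in it is an assumption made for illustration: the names `schedule_function` and `run_most_urgent`, the numeric priority scheme, the 8-entry limit, and the demonstration handlers that record the order in which they run. Protection against the shared-data problem is deliberately left out; in a real system the queue is shared with the interrupt routines, so the task code would have to disable interrupts (or use one of the techniques from Chapter 4) while it manipulates the queue.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*task_fn)(void);

typedef struct {
    task_fn fn;
    int priority;               /* larger value = more urgent (assumed scheme) */
} pending_entry;

static pending_entry pending[MAX_PENDING];
static size_t pending_count = 0;

/* Called by an interrupt routine in a real system.
   Returns false if there is no room for another entry. */
bool schedule_function(task_fn fn, int priority)
{
    if (pending_count == MAX_PENDING)
        return false;
    pending[pending_count].fn = fn;
    pending[pending_count].priority = priority;
    ++pending_count;
    return true;
}

/* Called from the main loop. Runs the most urgent pending function,
   whatever order the entries arrived in. Returns false if nothing is queued.
   NOTE: interrupt protection around the queue is omitted in this sketch. */
bool run_most_urgent(void)
{
    size_t best = 0;
    task_fn fn;

    if (pending_count == 0)
        return false;
    for (size_t i = 1; i < pending_count; ++i)
        if (pending[i].priority > pending[best].priority)
            best = i;

    fn = pending[best].fn;
    pending[best] = pending[pending_count - 1];   /* fill the gap */
    --pending_count;
    fn();
    return true;
}

/* Demonstration handlers (hypothetical): they only record the run order. */
static char run_order[MAX_PENDING + 1];
static size_t run_count = 0;

void function_A(void) { if (run_count < MAX_PENDING) run_order[run_count++] = 'A'; }
void function_B(void) { if (run_count < MAX_PENDING) run_order[run_count++] = 'B'; }
```

If `function_A` is queued at priority 1 and `function_B` is queued afterward at priority 5, the first call to `run_most_urgent` runs `function_B`. A function that is already running is never suspended, which is why one long low-priority function can still delay a high-priority one.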

Real-Time Operating System Architecture

The last architecture, the one that we will discuss in detail in Chapters 6, 7, and 8, is the architecture that uses a real-time operating system. We'll discuss sophisticated uses of this architecture in the later chapters; a very simple sketch of how it works is shown in Figure 5.9.

In this architecture, as in the others that we have been discussing, the interrupt routines take care of the most urgent operations. They then "signal" that there is work for the task code to do. The differences between this architecture and the previous ones are that:

Figure 5.9 Real-Time-Operating-System Architecture

void interrupt vHandleDeviceA (void)
{
   !! Take care of I/O Device A
   !! Set signal X
}

void interrupt vHandleDeviceB (void)
{
   !! Take care of I/O Device B
   !! Set signal Y
}

void Task1 (void)
{
   while (TRUE)
   {
      !! Wait for Signal X
      !! Handle data to or from I/O Device A
   }
}

void Task2 (void)
{
   while (TRUE)
   {
      !! Wait for Signal Y
      !! Handle data to or from I/O Device B
   }
}

• The necessary signaling between the interrupt routines and the task code is handled by the real-time operating system (the code for which is not shown in Figure 5.9). You need not use shared variables for this purpose.

• No loop in our code decides what needs to be done next. Code inside the real-time operating system (also not shown in Figure 5.9) decides which of the task code functions should run. The real-time operating system knows about the various task-code subroutines and will run whichever of them is more urgent at any given time.

• The real-time operating system can suspend one task code subroutine in the middle of its processing in order to run another.

Figure 5.10 Priority Levels for Real-Time-Operating-System Architectures
[Diagram: priority levels, from high-priority processing at the top to low-priority processing at the bottom, for the round-robin, round-robin-with-interrupts, and real-time-operating-system architectures.]

The first two of these differences are mostly programming convenience. The last one is substantial: systems using the real-time-operating-system architecture can control task code response as well as interrupt routine response. If Task1 is the highest priority task code in Figure 5.9, then when the interrupt routine vHandleDeviceA sets the signal X, the real-time operating system will run Task1 immediately. If Task2 is in the middle of processing, the real-time operating system will suspend it and run Task1 instead. Therefore, the worst-case wait for the highest-priority task code is zero (plus the execution time for interrupt routines). The possible priority levels for a real-time-operating-system architecture are shown in Figure 5.10.

A side-effect of this scheduling mechanism is that your system's response will be relatively stable, even when you change the code. The response times for a task code function in the round-robin architectures and in the function-queue architecture depend upon the lengths of the various task code subroutines, even lower-priority ones. When you change any subroutine, you potentially change response times throughout your system. In the real-time-operating-system architecture, changes to lower-priority functions do not generally affect the response of higher-priority functions.

Another advantage of the real-time-operating-system architecture is that real-time operating systems are widely available for purchase. By buying a real-time operating system, you get immediate solutions to some of your response problems. You typically get a useful set of debugging tools as well.

The primary disadvantage of the real-time-operating-system architecture (other than having to pay for the real-time operating system) is that the real-time operating system itself uses a certain amount of processing time. You are getting better response at the expense of a little bit of throughput.

We will discuss much more about real-time operating systems in the next several chapters. In particular, we will discuss what they can do for you, how you can use them effectively, and how you can avoid some of their disadvantages.

5.5 Selecting an Architecture

Here are a few suggestions about selecting an architecture for your system:

• Select the simplest architecture that will meet your response requirements. Writing embedded-system software is complicated enough without choosing an unnecessarily complex architecture for your software. (However, do remember that the requirements for version 2 will no doubt be more stringent than those for version 1.)

• If your system has response requirements that might necessitate using a real-time operating system, you should lean toward using a real-time operating system. Most commercial systems are sold with a collection of useful tools that will make it easier to test and debug your system.

• If it makes sense for your system, you can create hybrids of the architectures discussed in this chapter. For example, even if you are using a real-time operating system, you can have a low-priority task that polls those parts of the hardware that do not need fast response. Similarly, in a round-robin-with-interrupts architecture, the main loop can poll the slower pieces of hardware directly rather than reading flags set by interrupt routines.


Chapter Summary

Response requirements most often drive the choice of architecture.

The characteristics of the four architectures discussed are shown in Table 5.1. Generally, you will be better off choosing a simpler architecture.

One advantage of real-time operating systems is that you can buy them and thereby solve some of your problems without having to write the code yourself.

Hybrid architectures can make sense for some systems.

Table 5.1 Characteristics of Various Software Architectures

| Architecture | Priorities Available | Worst Response Time for Task Code | Stability of Response When the Code Changes | Simplicity |
|---|---|---|---|---|
| Round-robin | None | Sum of all task code | Poor | Very simple |
| Round-robin with interrupts | Interrupt routines in priority order, then all task code at the same priority | Total of execution time for all task code (plus execution time for interrupt routines) | Good for interrupt routines; poor for task code | Must deal with data shared between interrupt routines and task code |
| Function-queue-scheduling | Interrupt routines in priority order, then task code in priority order | Execution time for the longest function (plus execution time for interrupt routines) | Relatively good | Must deal with shared data and must write function queue code |
| Real-time operating system | Interrupt routines in priority order, then task code in priority order | Zero (plus execution time for interrupt routines) | Very good | Most complex (although much of the complexity is inside the operating system itself) |

Problems

1. Consider a system that controls the traffic lights at a major intersection. It reads from sensors that notice the presence of cars and pedestrians, it has a timer, and it turns the lights red and green appropriately. What architecture might you use for such a system? Why? What other information, if any, might influence your decision?

2. Reread the discussion of the Telegraph system in Chapter 1. What architecture might you use for such a system? Why?

3. Consider the code in Figure 5.11. To which of the architectures that we have discussed is this architecture most similar in terms of response?

4. Write C code to implement the function queue necessary for the function-queue-scheduling architecture. Your code should have two functions: one to add a function pointer to the back of the queue and one to read the first item from the front of the queue. The latter function can return a NULL pointer if the queue is empty. Be sure to disable interrupts around any critical sections in your code.

Figure 5.11 Another Architecture

static WORD wSignals;
#define SIGNAL_A 0x0001
#define SIGNAL_B 0x0002
#define SIGNAL_C 0x0004
#define SIGNAL_D 0x0008

void interrupt vHandleDeviceA (void)
{
   !! Reset device A
   wSignals |= SIGNAL_A;
}

void interrupt vHandleDeviceB (void)
{
   !! Reset device B
   wSignals |= SIGNAL_B;
}

void main (void)
{
   WORD wHighestPriorityFcn;

   while (TRUE)
   {
      /* Wait for something to happen */
      while (wSignals == 0)
         ;

      /* Find highest priority follow-up processing to do */
      wHighestPriorityFcn = SIGNAL_A;

      disable ();

      /* If one signal is not set . . . */
      while ((wSignals & wHighestPriorityFcn) == 0)
         /* . . . go to the next */
         wHighestPriorityFcn <<= 1;

      /* Reset this signal; we're about to service it. */
      wSignals &= ~wHighestPriorityFcn;

      enable ();

      /* Now do one of the functions. */
      switch (wHighestPriorityFcn)
      {
         case SIGNAL_A:
            !! Handle actions required by device A
            break;

         case SIGNAL_B:
            !! Handle actions required by device B
            break;
      }
   }
}

5. Enhance your code from Problem 4 to allow functions to be prioritized. The function that adds a pointer to the queue should take a priority parameter. Since this function is likely to be called from interrupt routines, make sure that it runs reasonably quickly. The function that reads items from the queue should return them in priority order.

6 Introduction to Real-Time Operating Systems

In this chapter and the next, we'll expand on the last chapter's discussion of the real-time-operating-system architecture. We'll look at the services offered by a typical real-time operating system and start to consider how to use them constructively. As you read this chapter and the next, you might want to examine the sample code and the µC/OS real-time operating system on the CD that accompanies this book. This code is explained fully in Chapter 11.

You may remember the caveat stated at the beginning of this book: embedded systems is a field in which the terminology is inconsistent. Never is this more true than when we discuss real-time operating systems. Many people use the acronym RTOS (which they pronounce "are toss"). Others use the terms kernel, real-time kernel, or the acronym for this, RTK. Some use all of these terms synonymously; others use kernel to mean some subcollection containing the most basic services offered by the larger RTOS. These latter people consider things like network support software, debugging tools, and perhaps even memory management to be part of the RTOS but not part of the kernel. Since there is no general agreement about where the kernel stops and the RTOS begins,1 this book will ignore these distinctions and use the term RTOS indiscriminately.

Despite the similar name, most real-time operating systems are rather different from desktop machine operating systems such as Windows or Unix. In the first place, on a desktop computer the operating system takes control of the machine as soon as it is turned on and then lets you start your applications. You compile and link your applications separately from the operating system. In an embedded system, you usually link your application and the RTOS. At boot-up time, your application usually gets control first, and it then starts the RTOS. Thus, the application and the RTOS are much more tightly tied to one another than are an application and its desktop operating system. We'll see the ramifications of this later.

1. This distinction is often made by people who sell this software, because they sell the kernel separately from the other features. When you're dealing with them, you have to understand their language and know what you're buying.

In the second place, many RTOSs do not protect themselves as carefully from your application as do desktop operating systems. For example, whereas most desktop operating systems check that any pointer you pass into a system function is valid, many RTOSs skip this step in the interest of better performance. Of course, if the application is doing something like passing a bad pointer into the RTOS, the application is probably about to crash anyway; for many embedded systems, it may not matter if the application takes the RTOS down with it: the whole system will have to be rebooted anyway.

In the third place, to save memory RTOSs typically include just the services that you need for your embedded system and no more. Most RTOSs allow you to configure them extensively before you link them to the application, letting you leave out any functions you don't plan to use. Unless you need them, you can configure away such common operating system functions as file managers, I/O drivers, utilities, and perhaps even memory management.

You can write your own RTOS, but you can (and probably should) buy one from one of the numerous vendors that sell them. Available today are VxWorks, VRTX, pSOS, Nucleus, C Executive, Lynx OS, QNX, MultiTask!, AMX, and dozens more. Others will no doubt come to market. Although there are special situations in which writing your own RTOS might make sense, they are few and far between. Unless your requirements for speed or code size or robustness are extreme, the commercial RTOSs represent a good value, in that they come already debugged and with a good collection of features and tools. This was not so true in the past, when the RTOS vendors offered less-sophisticated products, but the commercial RTOSs available today easily satisfy the requirements of the overwhelming majority of systems.

It is beyond the scope of this book to offer advice about which RTOS you should choose. In many ways the systems are very similar to one another: they offer most or all of the services discussed in this chapter and the next, they each support various microprocessors, and so on. Some of them even conform to the POSIX standard, a standard for operating system interfaces proposed by the Institute of Electrical and Electronic Engineers.2 We leave it to the salesmen from the various vendors to explain why their systems run faster than those of their competitors, use less memory, have a better application programming interface, have better debugging tools, support more processors, have more already-debugged network drivers for use with their systems, and so on.

In this chapter we'll discuss the concept of a task in an RTOS environment, we'll revisit the shared data problem, and we'll discuss semaphores, a method for dealing with shared data under an RTOS.

6.1 Tasks and Task States

The basic building block of software written under an RTOS is the task. Tasks are very simple to write: under most RTOSs a task is simply a subroutine. At some point in your program, you make one or more calls to a function in the RTOS that starts tasks, telling it which subroutine is the starting point for each task and some other parameters that we'll discuss later, such as the task's priority, where the RTOS should find memory for the task's stack, and so on. Most RTOSs allow you to have as many tasks as you could reasonably want.

Each task in an RTOS is always in one of three states:

1. Running, which means that the microprocessor is executing the instructions that make up this task. Unless yours is a multiprocessor system, there is only one microprocessor, and hence only one task that is in the running state at any given time.

2. Ready, which means that some other task is in the running state but that this task has things that it could do if the microprocessor becomes available. Any number of tasks can be in this state.

3. Blocked, which means that this task hasn't got anything to do right now, even if the microprocessor becomes available. Tasks get into this state because they are waiting for some external event. For example, a task that handles data coming in from a network will have nothing to do when there is no data. A task that responds to the user when he presses a button has nothing to do until the user presses the button. Any number of tasks can be in this state as well.

2. IEEE standard number 1003.4.

Most RTOSs seem to proffer a double handful of other task states. Included among the offerings are suspended, pended, waiting, dormant, and delayed. These usually just amount to fine distinctions among various subcategories of the blocked and ready states listed earlier.3 In this book, we'll lump all task states into running, ready, and blocked. You can find out how these three states correspond with those of your RTOS by reading the manual that comes with it.

The Scheduler

A part of the RTOS called the scheduler keeps track of the state of each task and decides which one task should go into the running state. Unlike the scheduler in Unix or Windows, the schedulers in most RTOSs are entirely simpleminded about which task should get the processor: they look at priorities you assign to the tasks, and among the tasks that are not in the blocked state, the one with the highest priority runs, and the rest of them wait in the ready state. The scheduler will not fiddle with task priorities: if a high-priority task hogs the microprocessor for a long time while lower-priority tasks are waiting in the ready state, that's too bad. The lower-priority tasks just have to wait; the scheduler assumes that you knew what you were doing when you set the task priorities.
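The selection rule just described can be sketched in a few lines of C. This is only an illustration of the rule, not the code of any particular RTOS: the structure layout, the names, and the convention that a larger number means a higher priority are all assumptions, and ties simply go to the first task found.

```c
#include <stddef.h>

enum TaskState { TASK_RUNNING, TASK_READY, TASK_BLOCKED };

struct Task
{
    int iPriority;          /* larger number = higher priority (assumed) */
    enum TaskState state;
};

/* Return the index of the task that should be in the running state:
   the highest-priority task that is not blocked. Returns -1 if every
   task is blocked. Priorities are never adjusted here. */
int iPickTaskToRun (const struct Task *aTasks, size_t cTasks)
{
    int iBest = -1;
    size_t i;

    for (i = 0; i < cTasks; ++i)
    {
        if (aTasks[i].state == TASK_BLOCKED)
            continue;       /* blocked tasks never get the processor */
        if (iBest < 0 || aTasks[i].iPriority > aTasks[iBest].iPriority)
            iBest = (int) i;
    }
    return iBest;
}
```

Notice that nothing in the loop looks at how long a task has been waiting; a low-priority ready task is passed over for as long as any higher-priority task is not blocked.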

Figure 6.1 shows the transitions among the three task states. In this book, we'll adopt the fairly common use of the verb block to mean "move into the blocked state," the verb run to mean "move into the running state" or "be in the running state," and the verb switch to mean "change which task is in the running state." The figure is self-explanatory, but there are a few consequences:

• A task will only block because it decides for itself that it has run out of things to do. Other tasks in the system or the scheduler cannot decide for a task that it needs to wait for something. As a consequence of this, a task has to be running just before it is blocked: it has to execute the instructions that figure out that there's nothing more to do.

• While a task is blocked, it never gets the microprocessor. Therefore, an interrupt routine or some other task in the system must be able to signal that whatever the task was waiting for has happened. Otherwise, the task will be blocked forever.

• The shuffling of tasks between the ready and running states is entirely the work of the scheduler. Tasks can block themselves, and tasks and interrupt routines can move other tasks from the blocked state to the ready state, but the scheduler has control over the running state. (Of course, if a task is moved from the blocked to the ready state and has higher priority than the task that is running, the scheduler will move it to the running state immediately. We can argue about whether the task was ever really in the ready state at all, but this is a semantic argument. The reality is that some part of the application had to do something to the task, namely move it out of the blocked state, and then the scheduler had to make a decision.)

3. The distinctions among these other states are sometimes important to the engineers who wrote the RTOS (and perhaps to the marketers who are selling it, who want us to know how much we're getting for our money), but they are usually not important to the user.

Figure 6.1 Task States
[State diagram. Blocked to Ready: whatever the task needs, happens. Running to Blocked: task needs something to happen before it can continue. Ready to Running: this is the highest-priority ready task. Running to Ready: another ready task is higher priority.]
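The four transitions among the task states can be written down as a small lookup, which makes the consequences listed above easy to check. The enumeration and function name here are hypothetical; the sketch only encodes which moves are possible, not who performs them.

```c
enum StateId { ST_BLOCKED, ST_READY, ST_RUNNING };

/* Returns 1 if a task can move directly from stFrom to stTo, 0 if not. */
int fTransitionAllowed (enum StateId stFrom, enum StateId stTo)
{
    switch (stFrom)
    {
        case ST_RUNNING:
            /* A running task blocks itself, or the scheduler moves it
               back to ready because a higher-priority task is ready. */
            return stTo == ST_BLOCKED || stTo == ST_READY;

        case ST_READY:
            /* Only the scheduler moves a ready task into running; a
               ready task cannot block, because it is not executing. */
            return stTo == ST_RUNNING;

        case ST_BLOCKED:
            /* Another task or an interrupt routine unblocks it; it
               becomes ready, and the scheduler decides the rest. */
            return stTo == ST_READY;
    }
    return 0;
}
```

For example, fTransitionAllowed (ST_READY, ST_BLOCKED) returns 0, which is the first consequence above: a task has to be running just before it is blocked.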

Here are answers to some common questions about the scheduler and task states:

How does the scheduler know when a task has become blocked or unblocked? The RTOS provides a collection of functions that tasks can call to tell the scheduler what events they want to wait for and to signal that events have happened. We'll be discussing these functions in the rest of this chapter and in the next chapter.

What happens if all the tasks are blocked? If all the tasks are blocked, then the scheduler will spin in some tight loop somewhere inside of the RTOS, waiting for something to happen. If nothing ever happens, then that's your fault. You must make sure that something happens sooner or later by having an interrupt routine that calls some RTOS function that unblocks a task. Otherwise, your software will not be doing very much.

What if two tasks with the same priority are ready? The answer to this is all over the map, depending upon which RTOS you use. At least one system solves this problem by making it illegal to have two tasks with the same priority. Some other RTOSs will time-slice between two such tasks. Some will run one of them until it blocks and then run the other. In this last case, which of the two tasks it runs also depends upon the particular RTOS. In Chapter 8, we'll discuss whether you should have more than one task with the same priority anyway.

If one task is running and another, higher-priority task unblocks, does the task that is running get stopped and moved to the ready state right away? A preemptive RTOS will stop a lower-priority task as soon as the higher-priority task unblocks. A nonpreemptive RTOS will only take the microprocessor away from the lower-priority task when that task blocks. In this book, we will assume that the RTOS is preemptive (and in fact we already did so in the last chapter when we discussed the characteristics of RTOS response). Nonpreemptive RTOSs have characteristics very different from preemptive ones. See the problems at the end of this chapter for more thoughts about nonpreemptive RTOSs.

A Simple Example

Figure 6.2 is the classic situation in which an RTOS can make a difficult system easy to build. This pseudo-code is from the underground tank monitoring system.4 Here, the vLevelsTask task uses up a lot of computing time figuring out how much gasoline is in the tanks, and in fact will use up as much computing time as it can get. However, as soon as the user pushes a button, the vButtonTask task unblocks. The RTOS will stop the low-priority vLevelsTask task in its tracks, move it to the ready state, and run the high-priority vButtonTask task to let it respond to the user. When the vButtonTask task is finished responding, it blocks, and the RTOS gives the microprocessor back to the vLevelsTask task once again. (See Figure 6.3.)

4. Real code for this is in Figures 11.4 and 11.8.

Figure 6.2 Uses for Tasks

/* "Button Task" */
void vButtonTask (void)   /* High priority */
{
   while (TRUE)
   {
      !! Block until user pushes a button
      !! Quick: respond to the user
   }
}

/* "Levels Task" */
void vLevelsTask (void)   /* Low priority */
{
   while (TRUE)
   {
      !! Read levels of floats in tank
      !! Calculate average float level
      !! Do some interminable calculation
      !! Do more interminable calculation
      !! Do yet more interminable calculation
      !! Figure out which tank to do next
   }
}
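Whether the low-priority task is stopped the moment the button task unblocks depends on whether the RTOS is preemptive. The decision can be sketched as follows; the function name and the convention that a larger number means a higher priority are assumptions made for this illustration.

```c
/* Returns 1 if the RTOS should take the microprocessor away from the
   running task at the moment another task unblocks, 0 otherwise.
   Larger numbers mean higher priority (an assumption of this sketch). */
int fSwitchOnUnblock (int fPreemptive, int iRunningPriority, int iUnblockedPriority)
{
    if (!fPreemptive)
        return 0;   /* nonpreemptive: the running task keeps the
                       microprocessor until it blocks on its own */

    /* preemptive: switch at once if the newly ready task outranks
       the task that is running */
    return iUnblockedPriority > iRunningPriority;
}
```

Under a nonpreemptive RTOS the button response in this example would be delayed until the levels task next blocked, which for an "interminable" calculation could be a long time.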

Figure 6.3 Microprocessor Responds to a Button under an RTOS
[Timeline of vButtonTask and vLevelsTask. vLevelsTask is busy calculating while vButtonTask is blocked. The user presses a button; the RTOS switches the microprocessor to vButtonTask, and vLevelsTask is ready. vButtonTask does everything it needs to do to respond to the button. vButtonTask finishes its work and blocks again; the RTOS switches the microprocessor back to vLevelsTask. The microprocessor's attention switches from task to task in response to the buttons.]

Figure 6.4 RTOS Initialization Code

void main (void)
{
   /* Initialize (but do not start) the RTOS */
   InitRTOS ();

   /* Tell the RTOS about our tasks */
   StartTask (vRespondToButton, HIGH_PRIORITY);
   StartTask (vCalculateTankLevels, LOW_PRIORITY);

   /* Start the RTOS. (This function never returns.) */
   StartRTOS ();
}

One convenient feature of the RTOS is that the two tasks can be written independently of one another, and the system will still respond well. Whoever writes the code to do the calculating can write it without worrying about how fast the system has to respond to button presses. The RTOS will make the response good whenever the user presses a button by turning the microprocessor over to the task that responds to the buttons immediately.

Obviously, to make this work, there must be code somewhere that tells the RTOS that each of the subroutines is a task and that the calculation task has a lower priority than the button task. Code like that in Figure 6.4 might do the job. Note that this is the main function, where the application will start, and it is the job of this code to start the RTOS. It is fairly common to have one RTOS function that initializes the RTOS data structures, InitRTOS in this example, and another function that really starts the RTOS running, StartRTOS in this example. The StartRTOS function never returns; after it is called, the RTOS scheduler runs the various different tasks.

6.2 Tasks and Data

Each task has its own private context, which includes the register values, a program counter, and a stack. However, all other data (global, static, initialized, uninitialized, and everything else) is shared among all of the tasks in the system. As shown in Figure 6.5, Task 1, Task 2, and Task 3 can access any of the data in the system. (If you're familiar with Windows or Unix, you can see that tasks in an RTOS are more like threads than like processes.)5

Figure 6.5 Data in an RTOS-Based Real-Time System
[Diagram showing the RTOS data structures alongside each task's own registers and stack.]

The RTOS typically has its own private data structures, which are not available to any of the tasks.

Since you can share data variables among tasks, it is easy to move data from one task to another: the two tasks need only have access to the same variables. You can easily accomplish this by having the two tasks in the same module in which the variables are declared, or you can make the variables public in one of the tasks and declare them extern in the other. Figure 6.6 shows how the former might be accomplished. This is the same program as the one in Figure 6.2, only fleshed out with some detail. Now we see that the vRespondToButton task prints out some data that is maintained by the vCalculateTankLevels task. Both tasks can access the tankdata array of structures just as they could if this system were written without an RTOS. The normal rules of C apply to variable scope.

5. There are now a few commercial RTOSs available in which each task has a separate data area, more like a process, but these are still in the minority.

Figure 6.6 Sharing Data among RTOS Tasks

struct
{
   long lTankLevel;
   long lTimeUpdated;
} tankdata[MAX_TANKS];

/* "Button Task" */
void vRespondToButton (void)   /* High priority */
{
   int i;

   while (TRUE)
   {
      !! Block until user pushes a button
      i = !! ID of button pressed;
      printf ("\nTIME: %08ld   LEVEL: %08ld",
         tankdata[i].lTimeUpdated,
         tankdata[i].lTankLevel);
   }
}

/* "Levels Task" */
void vCalculateTankLevels (void)   /* Low priority */
{
   int i = 0;

   while (TRUE)
   {
      !! Read levels of floats in tank i
      !! Do more interminable calculation
      !! Do yet more interminable calculation

      /* Store the result */
      tankdata[i].lTimeUpdated = !! Current time

      /* Between these two instructions is a
         bad place for a task switch */
      tankdata[i].lTankLevel = !! Result of calculation

      !! Figure out which tank to do next
      i = !! something new
   }
}

Shared-Data Problems

Unfortunately, there is a bug in the code in Figure 6.6. Figure out what it is before you read on.

If you have a sinking sense of deja vu, there's a reason. In Chapter 4, we looked at several examples in which bugs cropped up because an interrupt routine shared data with task code in the system. Here we have two tasks sharing data, and unfortunately all of the same kinds of bugs we looked at before can come right back to haunt us. The RTOS might stop vCalculateTankLevels at any time and run vRespondToButton. Remember, that's what we want the RTOS to do, so as to get good response. However, the RTOS might stop vCalculateTankLevels right in the middle of setting data in the tankdata array (which is not an atomic operation), and vRespondToButton might then read that half-changed data.
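The half-changed read can be replayed deterministically, without an RTOS, by putting the reader's access between the writer's two stores. This is only a single-threaded simulation of one unlucky task switch; the structure name, the function name, and the sample values are hypothetical.

```c
struct TankData
{
    long lTankLevel;
    long lTimeUpdated;
};

/* Replays one bad interleaving. Returns 1 if the reader ends up with
   a new time paired with an old level, 0 otherwise. */
int fReaderSeesMismatch (void)
{
    struct TankData tank = { 100, 10 };   /* level 100, measured at time 10 */
    struct TankData snapshot;

    tank.lTimeUpdated = 20;   /* levels task stores the new time . . .      */
    snapshot = tank;          /* . . . task switch: button task reads here   */
    tank.lTankLevel = 250;    /* levels task resumes, stores the new level   */

    /* The button task would print time 20 next to level 100, a pair
       of values that never described the tank at any one moment. */
    return snapshot.lTimeUpdated == 20 && snapshot.lTankLevel == 100;
}
```

Each store on its own is harmless; the bug is that the two stores together are not atomic with respect to the reader.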

In the next section we'll discuss some tools in the RTOS that help us fix this problem, but before we look at the solution, let's look at some of the subtle manifestations of this problem. Figure 6.7 shows another example. In it, both Task1 and Task2 call vCountErrors. This is a perfectly valid thing to do in an RTOS: any or all of the tasks can share as many subroutines as is convenient. But Figure 6.7 has a potential bug in it. Examine the figure and see if you can see what the problem is.

Figure 6.7 Tasks Can Share Code

void Task1 (void)
{
   vCountErrors (9);
}

void Task2 (void)
{
   vCountErrors (11);
}

static int cErrors;

void vCountErrors (int cNewErrors)
{
   cErrors += cNewErrors;
}

The difficulty with the program in Figure 6.7 is that because both Task1 and Task2 call vCountErrors, and since vCountErrors uses the variable cErrors, the variable cErrors is now shared by the two tasks (and again used in a nonatomic way). If Task1 calls vCountErrors, and if the RTOS then stops Task1 and runs Task2, which then calls vCountErrors, the variable cErrors may get corrupted in just the same way as it would if Task2 were an interrupt routine that had interrupted Task1.

If it is unclear to you why this code in Figure 6.7 fails, examine Figure 6.8. The assembly code for vCountErrors is at the top of that figure; below it is a potential sequence of events that causes a bug. Suppose that the value 5 is stored in cErrors. Suppose that Task1 calls vCountErrors(9), and suppose that vCountErrors does the MOVE and ADD instructions, leaving the result in register R1. Suppose now that the RTOS stops Task1 and runs Task2 and that Task2 calls vCountErrors(11). The code in vCountErrors fetches the old value of cErrors, adds 11 to it, and stores the result. Eventually, the RTOS switches back to Task1, which then executes the next instruction in vCountErrors, saving whatever is in register R1 to cErrors and overwriting the value written by Task2. Instead of cErrors ending up as 25 (the original 5, plus 11 plus 9), as it should, it ends up as 14. Note that the RTOS can be counted upon to save the value in register R1 for Task1 while Task2 is running and to restore it later when Task1 resumes.
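The sequence of events just described can be replayed step by step in ordinary C by modeling each task's saved copy of register R1 as a local variable. This is a single-threaded simulation of that one interleaving, not real multitasking code, and the function and variable names are hypothetical.

```c
/* Replays the lost update described above and returns the final
   value of cErrors. */
int iReplayLostUpdate (void)
{
    int cErrors = 5;
    int iR1Task1;           /* R1 as saved and restored for Task1 */
    int iR1Task2;           /* R1 as saved and restored for Task2 */

    /* Task1 calls vCountErrors (9) */
    iR1Task1 = cErrors;     /* MOVE R1, (cErrors)      R1 = 5   */
    iR1Task1 += 9;          /* ADD  R1, (cNewErrors)   R1 = 14  */

    /* RTOS switches to Task2, which calls vCountErrors (11) */
    iR1Task2 = cErrors;     /* MOVE R1, (cErrors)      R1 = 5   */
    iR1Task2 += 11;         /* ADD  R1, (cNewErrors)   R1 = 16  */
    cErrors = iR1Task2;     /* MOVE (cErrors), R1      cErrors = 16 */

    /* RTOS switches back to Task1 and restores its R1 */
    cErrors = iR1Task1;     /* MOVE (cErrors), R1      cErrors = 14 */

    return cErrors;
}
```

The function returns 14 rather than 25: the 11 errors counted by Task2 are lost when Task1 stores its stale result.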

Figure 6.8 Why the Code in Figure 6.7 Fails

Assembly code for vCountErrors

; void vCountErrors (int cNewErrors)
; {
;    cErrors += cNewErrors;
        MOVE R1, (cErrors)
        ADD R1, (cNewErrors)
        MOVE (cErrors), R1
        RETURN
; }

Time                              R1 for Task1   R1 for Task2   cErrors

Task1 calls vCountErrors (9)                                       5
   MOVE R1, (cErrors)                  5                           5
   ADD R1, (cNewErrors)               14                           5

RTOS switches to Task2
Task2 calls vCountErrors (11)
   MOVE R1, (cErrors)                                 5            5
   ADD R1, (cNewErrors)                              16            5
   MOVE (cErrors), R1                                16           16

RTOS switches to Task1
   MOVE (cErrors), R1                 14                          14

Reentrancy

People sometimes characterize the problem in Figure 6.7 by saying that the shared function vCountErrors is not reentrant. Reentrant functions are functions that can be called by more than one task and that will always work correctly, even if the RTOS switches from one task to another in the middle of executing the function. The function vCountErrors does not qualify.

You apply three rules to decide if a function is reentrant:

1. A reentrant function may not use variables in a nonatomic way unless they are stored on the stack of the task that called the function or are otherwise the private variables of that task.

2. A reentrant function may not call any other functions that are not themselves reentrant.

3. A reentrant function may not use the hardware in a nonatomic way.

Figure 6.9 Variable Storage

static int static_int;
int public_int;
int initialized = 4;
char *string = "Where does this string go?";
void *vPointer;

void function (int parm, int *parm_ptr)
{
   static int static_local;
   int local;
}

A Review of C Variable Storage

To better understand reentrancy, and in particular rule 1 above, you must first understand where the C compiler will store variables. If you are a C language guru, you can skip the following discussion of where variables are stored in memory. If not, review your knowledge of C by examining Figure 6.9 and answering these questions: Which of the variables in Figure 6.9 are stored on the stack and which in a fixed location in memory? What about the string literal "Where does this string go?" What about the data pointed to by vPointer? By parm_ptr?

Here are the answers:

• static_int--is in a fixed location in memory and is therefore shared by any task that happens to call function.

• public_int--Ditto. The only difference between static_int and public_int is that functions in other C files can access public_int, but they cannot access static_int. (This means, of course, that it is even harder to be sure that this variable is not used by multiple tasks, since it might be used by any function in any module anywhere in the system.)6

• initialized--The same. The initial value makes no difference to where the variable is stored.

• string--The same.

• "Where does this string go?"--Also the same.

• vPointer--The pointer itself is in a fixed location in memory and is therefore a shared variable. If function uses or changes the data values pointed to by vPointer, then those data values are also shared among any tasks that happen to call function.

• parm--is on the stack.7 If more than one task calls function, parm will be in a different location for each, because each task has its own stack. No matter how many tasks call function, the variable parm will not be a problem.

• parm_ptr--is on the stack. Therefore, function can do anything to the value of parm_ptr without causing trouble. However, if function uses or changes the values of whatever is pointed to by parm_ptr, then we have to ask where that data is stored before we know whether we have a problem. We can't answer that question just by looking at the code in Figure 6.9. If we look at the code that calls function and can be sure that every task will pass a different value for parm_ptr, then all is well. If two tasks might pass in the same value for parm_ptr, then there might be trouble.

• static_local--is in a fixed location in memory. The only difference between this and static_int is that static_int can be used by other functions in the same C file, whereas static_local can only be used by function.

• local--is on the stack.

6. Of course, if you want, you can write code that passes the address of static_int to some function in another C file, and then that function could use static_int. After that, static_int would be as big a problem as public_int.
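The difference between fixed-location and stack variables is easy to observe directly. In the sketch below (the function names are invented for this illustration), the static local keeps its value from one call to the next because there is only one copy of it, whereas the ordinary local starts fresh on every call.

```c
#include <assert.h>

/* Fixed location: one copy, shared by every caller (and every task). */
int cCountWithStatic(void)
{
    static int static_local = 0;
    return ++static_local;
}

/* On the stack: a fresh copy for every call. */
int cCountWithLocal(void)
{
    int local = 0;
    return ++local;
}

void vStorageExample(void)
{
    assert(cCountWithStatic() == 1);
    assert(cCountWithStatic() == 2);   /* remembers the earlier call */
    assert(cCountWithLocal() == 1);
    assert(cCountWithLocal() == 1);    /* starts over each time */
}
```

If two tasks called cCountWithStatic, they would both be updating that single copy, which is exactly the situation rule 1 warns about.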

Applying the Reentrancy Rules

Whether or not you are a C language guru, examine the function display in Figure 6.10 and decide if it is reentrant and why it is or isn't.

7. Be forewarned that there is at least one compiler out there that would put parm, parm_ptr, and local in fixed locations. This compiler is not in compliance with any C standard, but it produces code for an 8051, an 8-bit microcontroller. The ability to write in C for this tiny machine is worth some compromises.

Figure 6.10 Another Reentrancy Example

BOOL fError;   /* Someone else sets this */

void display (int j)
{
   if (!fError)
   {
      printf ("\nValue: %d", j);
      j = 0;
      fError = TRUE;
   }
   else
   {
      printf ("\nCould not display value");
      fError = FALSE;
   }
}

This function is not reentrant, for two reasons. First, the variable fError is in a fixed location in memory and is therefore shared by any task that calls display. The use of fError is not atomic, because the RTOS might switch tasks between the time that it is tested and the time that it is set. This function therefore violates rule 1. Note that the variable j is no problem; it's on the stack.

The second problem is that this function may violate rule 2 as well. For this function to be reentrant, printf must also be reentrant. Is printf reentrant? Well, it might be, but don't count on it unless you have looked in the manual that comes with the compiler you are using and seen an explicit statement that it is.
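One way to see what rule 1 asks for is to rewrite display so that everything it touches belongs to the caller. The version below is a hypothetical variant, not the function from Figure 6.10: the name vDisplayReentrant, the per-task flag, and the caller-supplied buffer are all assumptions made for the sake of the example, and it changes the design in that each task now keeps its own error flag. Rule 2 still applies: the sketch is only as reentrant as your library's snprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int BOOL;
#define TRUE 1
#define FALSE 0

/*
 * Hypothetical variant of display. The error flag and the output
 * buffer belong to the calling task, so no two tasks touch the
 * same data (rule 1).
 */
void vDisplayReentrant(BOOL *p_fError, int j, char *p_chOut, size_t cbOut)
{
    if (!*p_fError)
    {
        snprintf(p_chOut, cbOut, "\nValue: %d", j);
        *p_fError = TRUE;
    }
    else
    {
        snprintf(p_chOut, cbOut, "\nCould not display value");
        *p_fError = FALSE;
    }
}

void vDisplayExample(void)
{
    BOOL fTaskError = FALSE;      /* private to this caller */
    char a_chLine[40];

    vDisplayReentrant(&fTaskError, 7, a_chLine, sizeof a_chLine);
    assert(strcmp(a_chLine, "\nValue: 7") == 0);
    assert(fTaskError == TRUE);
}
```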

Gray Areas of Reentrancy

There are some gray areas between reentrant and nonreentrant functions. The code here shows a very simple function in the gray area.

static int cErrors;

void vCountErrors (void)
{
   ++cErrors;
}

This function obviously modifies a nonstack variable, but rule 1 says that a reentrant function may not use nonstack variables in a nonatomic way. The question is: is incrementing cErrors atomic?


6.3 SEMAPHORES AND SHARED DATA 153

As with a number of the shared-data problems that we discussed in Chapter 4, we can answer this question only with a definite "maybe," because the answer depends upon the microprocessor and the compiler that you are using. If you're using an 8051, an 8-bit microcontroller, then ++cErrors is likely to compile into assembly code something like this:

         MOV   DPTR,#cErrors+01H
         MOVX  A,@DPTR
         INC   A
         MOVX  @DPTR,A
         JNZ   noCarry
         MOV   DPTR,#cErrors
         MOVX  A,@DPTR
         MOVX  @DPTR,A
noCarry:
         RET

which doesn't look very atomic and indeed isn't anywhere close to atomic, since it takes nine instructions to do the real work, and an interrupt (and consequent task switch) might occur anywhere among them.

But if you're using an Intel 80x86, you might get:

INC (cErrors)

RET

which is atomic.

If you really need the performance of the one-instruction function and you're using an 80x86 and you put in lots of comments, perhaps you can get away with writing vCountErrors this way. However, there's no way to know that it will work with the next version of the compiler or with some other microprocessor to which you later have to port it. Writing vCountErrors this way is a way to put a little land mine in your system, just waiting to explode. Therefore, if you need vCountErrors to be reentrant, you should use one of the techniques discussed in the rest of this book.
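For comparison only, newer C compilers offer a way to ask for an indivisible increment explicitly instead of hoping the compiler happens to emit one. The sketch below assumes a toolchain that supports the C11 header stdatomic.h; many compilers for small microcontrollers do not, and whether the guarantee extends to your RTOS's tasks and interrupt routines depends on the toolchain. It is not one of the techniques this book goes on to discuss, and the helper names cErrorsRead and vAtomicCountExample are invented here.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int cErrors;   /* zero-initialized, like any static */

void vCountErrors(void)
{
    /* One indivisible read-modify-write, however it is implemented. */
    atomic_fetch_add(&cErrors, 1);
}

int cErrorsRead(void)
{
    return atomic_load(&cErrors);
}

void vAtomicCountExample(void)
{
    int cBefore = cErrorsRead();

    vCountErrors();
    vCountErrors();
    assert(cErrorsRead() == cBefore + 2);
}
```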

Semaphores and Shared Data

In the last section, we discussed how the RTOS can cause a new class of shared-data problems by switching the microprocessor from task to task and, like interrupts, changing the flow of execution. The RTOS, however, also gives you some new tools with which to deal with this problem. Semaphores are one such tool.


Figure 6.11 Semaphores

Back in the bad old days, the railroad barons discovered that it was bad for business if their trains ran into one another. Their solution to this problem was to use signals called "semaphores." Examine Figure 6.11. When the first train enters the pictured section of track, the semaphore behind it automatically lowers. When a second train arrives, the engineer notes the lowered semaphore, and he stops his train and waits for the semaphore to rise. When the first train leaves that section of track, the semaphore rises, and the engineer on the second train knows that it is safe to proceed on. There is no possibility of the second train running into the first one. The idea of a semaphore in an RTOS is similar to the idea of a railroad semaphore.

Trains do two things with semaphores. First, when a train leaves the protected section of track, it raises the semaphore. Second, when a train comes to a semaphore, it waits for the semaphore to rise, if necessary, passes through the (now raised) semaphore, and lowers the semaphore. The typical semaphore in an RTOS works much the same way.

RTOS Semaphores

Although the word was originally coined for a particular concept, the word semaphore is now one of the most slippery in the embedded-systems world. It seems to mean almost as many different things as there are software engineers, or at least as there are RTOSs. Some RTOSs even have more than one kind of semaphore. Also, no RTOS uses the terms raise and lower; they use get and give, take and release, pend and post, p and v, wait and signal, and any number of other combinations. We will use take (for lower) and release (for raise). We'll discuss first a kind of semaphore most commonly called a binary semaphore, which is the kind most similar to the railroad semaphore; we'll mention a few variations below.

A typical RTOS binary semaphore works like this: tasks can call two RTOS functions, TakeSemaphore and ReleaseSemaphore. If one task has called TakeSemaphore to take the semaphore and has not called ReleaseSemaphore to release it, then any other task that calls TakeSemaphore will block until the first task calls ReleaseSemaphore. Only one task can have the semaphore at a time.
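The bookkeeping behind that rule is small. The toy model below is not any RTOS's implementation, and every name in it (BINSEM, fBinSemTryTake, vBinSemRelease) is hypothetical. Since there is no scheduler here, a take that would block simply reports failure; a real RTOS would suspend the calling task at that point and resume it on release.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_TASK (-1)

/* Toy model of a binary semaphore: records which task holds it. */
typedef struct
{
    int iHolder;    /* task ID of the holder, or NO_TASK */
} BINSEM;

/* Returns true if the task got the semaphore, false if it would block. */
bool fBinSemTryTake(BINSEM *p_sem, int iTask)
{
    if (p_sem->iHolder != NO_TASK)
        return false;           /* a real RTOS would block the caller here */
    p_sem->iHolder = iTask;
    return true;
}

void vBinSemRelease(BINSEM *p_sem)
{
    p_sem->iHolder = NO_TASK;   /* a real RTOS would unblock a waiter here */
}

void vBinSemExample(void)
{
    BINSEM sem = { NO_TASK };

    assert(fBinSemTryTake(&sem, 1));     /* task 1 takes it */
    assert(!fBinSemTryTake(&sem, 2));    /* task 2 would block */
    vBinSemRelease(&sem);                /* task 1 releases it */
    assert(fBinSemTryTake(&sem, 2));     /* now task 2 gets it */
}
```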

The typical use for a semaphore is to solve the sort of problem that we saw in Figure 6.6. Figure 6.12 illustrates how to do this.

Figure 6.12 Semaphores Protect Data

struct
{
   long lTankLevel;
   long lTimeUpdated;
} tankdata[MAX_TANKS];

/* "Button Task" */
void vRespondToButton (void)   /* High priority */
{
   int i;
   while (TRUE)
   {
      !! Block until user pushes a button
      i = !! Get ID of button pressed
      TakeSemaphore ();
      printf ("\nTIME: %08ld   LEVEL: %08ld",
         tankdata[i].lTimeUpdated,
         tankdata[i].lTankLevel);
      ReleaseSemaphore ();
   }
}

/* "Levels Task" */
void vCalculateTankLevels (void)   /* Low priority */
{
   int i = 0;
   while (TRUE)
   {
      TakeSemaphore ();
      !! Set tankdata[i].lTimeUpdated
      !! Set tankdata[i].lTankLevel
      ReleaseSemaphore ();
   }
}

Before the "levels task" (vCalculateTankLevels) updates the data in the structure, it calls TakeSemaphore to take (lower) the semaphore. If the user presses a button while the levels task is still modifying the data and still has the semaphore, then the following sequence of events occurs:

1. The RTOS will switch to the "button task," just as before, moving the levels task to the ready state.

2. When the button task tries to get the semaphore by calling TakeSemaphore, it will block because the levels task already has the semaphore.

3. The RTOS will then look around for another task to run and will notice that the levels task is still ready. With the button task blocked, the levels task will get to run until it releases the semaphore.

4. When the levels task releases the semaphore by calling ReleaseSemaphore, the button task will no longer be blocked, and the RTOS will switch back to it.

The sequence of C instructions in each task that the system executes in this case is shown in Figure 6.13.

The result of this sequence is that the levels task can always finish modifying the data before the button task can use it. There is no chance of the button task reading half-changed data.

Figure 6.13 Execution Flow with Semaphores

Code in the vCalculateTankLevels task.    Code in the vRespondToButton task.

Levels task is calculating tank levels.   Button task is blocked waiting for a button.

TakeSemaphore ();
!! Set tankdata[i].lTimeUpdated

      The user pushes a button; the higher-priority button task
      unblocks; the RTOS switches tasks.

                                          i = !! Get ID of button
                                          TakeSemaphore ();
                                          (This does not return yet)

      The semaphore is not available; the button task blocks;
      the RTOS switches back.

!! Set tankdata[i].lTankLevel
ReleaseSemaphore ();

      Releasing the semaphore unblocks the button task; the RTOS
      switches again.

                                          (Now TakeSemaphore returns)
                                          printf (...);
                                          ReleaseSemaphore ();
                                          !! Block until user pushes a button

      The button task blocks; the RTOS resumes the levels task.


Figure 6.14 is the nuclear reactor system, this time with a task rather than an interrupt routine reading the temperatures. The functions and data structures whose names begin with "OS" are those used in µC/OS. The OSSemPost and OSSemPend functions raise and lower the semaphore. The OSSemCreate function initializes the semaphore, and it must be called before either of the other two. The OS_EVENT structure stores the data that represents the semaphore, and it is entirely managed by the RTOS. The WAIT_FOREVER parameter to the OSSemPend function indicates that the task making the call is willing to wait forever for the semaphore; we will discuss this concept further in Chapter 7. The OSTimeDly function causes vReadTemperatureTask to block for approximately a quarter of a second; the event that unblocks it is simply the expiration of that amount of time. Therefore, this task wakes up, reads in the two temperatures, and places them in the array once every quarter of a second. In the meantime, vControlTask checks continuously that the two temperatures are equal.

The calls to OSSemPend and OSSemPost in this code fix the shared-data problems we have discussed in the past in conjunction with this example. One possible subtle bug nonetheless is hiding in the code in Figure 6.14. Do you see it?

Initializing Semaphores

The bug arises with the call to OSSemCreate, which must happen before vReadTemperatureTask calls OSSemPend to use the semaphore. How do you know that this really happens? You don't.

Figure 6.14 Semaphores Protect Data in the Nuclear Reactor

#define TASK_PRIORITY_READ 11
#define TASK_PRIORITY_CONTROL 12
#define STK_SIZE 1024
static unsigned int ReadStk [STK_SIZE];
static unsigned int ControlStk [STK_SIZE];

static int iTemperatures[2];
OS_EVENT *p_semTemp;

void main (void)
{
   /* Initialize (but do not start) the RTOS */
   OSInit ();

   /* Tell the RTOS about our tasks */
   OSTaskCreate (vReadTemperatureTask, NULLP,
      (void *)&ReadStk[STK_SIZE], TASK_PRIORITY_READ);
   OSTaskCreate (vControlTask, NULLP,
      (void *)&ControlStk[STK_SIZE], TASK_PRIORITY_CONTROL);

   /* Start the RTOS. (This function never returns.) */
   OSStart ();
}

void vReadTemperatureTask (void)
{
   while (TRUE)
   {
      OSTimeDly (5);   /* Delay about 1/4 second */

      OSSemPend (p_semTemp, WAIT_FOREVER);
      !! read in iTemperatures[0];
      !! read in iTemperatures[1];
      OSSemPost (p_semTemp);
   }
}

void vControlTask (void)
{
   p_semTemp = OSSemCreate (1);
   while (TRUE)
   {
      OSSemPend (p_semTemp, WAIT_FOREVER);
      if (iTemperatures[0] != iTemperatures[1])
         !! Set off howling alarm;
      OSSemPost (p_semTemp);

      !! Do other useful work
   }
}

Now you might argue that since vReadTemperatureTask calls OSTimeDly at the beginning before calling OSSemPend, vControlTask should have enough time to call OSSemCreate. Yes, you might argue that, but if you write embedded code that relies on that kind of thing, you will chase mysterious bugs for the rest of your career. How do you know that there isn't (or won't be some day) some higher-priority task that takes up all of the delay time in vReadTemperatureTask?

Alternatively, you might argue that you can make it work for sure by giving vControlTask a higher priority than vReadTemperatureTask. Yes, that's true, too ... until some compelling (and probably more valid) reason comes up to make vReadTemperatureTask a higher priority than vControlTask and someone makes the change without realizing that you put this time bomb into the code.

Don't fool around. Put the semaphore initialization call to OSSemCreate in some start-up code that's guaranteed to run first. The main function shown in Figure 6.14, somewhere before the call to OSStart, would be a good place for the call to OSSemCreate.
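The shape of that advice can be shown without a real RTOS. In the sketch below, the SEM type and the functions whose names contain "Stub" are hypothetical stand-ins invented for this example, not µC/OS calls; the point is only that the semaphore is created in start-up code, so that no task body can run before it exists, whatever the task priorities turn out to be.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTOS semaphore services (not a real API). */
typedef struct { int cCount; } SEM;

static SEM semStorage;
static SEM *p_semTemp = NULL;      /* NULL until the semaphore exists */

SEM *p_semStubCreate(int cInitial)
{
    semStorage.cCount = cInitial;
    return &semStorage;
}

void vStubPend(SEM *p_sem)
{
    assert(p_sem != NULL);         /* catches use before creation */
    --p_sem->cCount;
}

void vStubPost(SEM *p_sem)
{
    assert(p_sem != NULL);
    ++p_sem->cCount;
}

/* Start-up code: called from main before the RTOS starts any task. */
void vCreateSharedSemaphores(void)
{
    p_semTemp = p_semStubCreate(1);
}

/* One pass through the body of a task that uses the semaphore. */
void vTaskBodyOnce(void)
{
    vStubPend(p_semTemp);
    /* ... access the shared temperatures here ... */
    vStubPost(p_semTemp);
}

int fSemaphoreCreated(void)
{
    return p_semTemp != NULL;
}
```

If vTaskBodyOnce ran before vCreateSharedSemaphores, the assertion in vStubPend would fire; with a real RTOS you would more likely get a crash or a silent failure instead.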

Reentrancy and Semaphores

In Figure 6.15, we revisit the shared function vCountErrors, which back in Figure 6.7 was not reentrant. In Figure 6.15, however, the code that modifies the static variable cErrors is surrounded by calls to semaphore routines. In the language of data sharing, we have protected cErrors with a semaphore. Whichever task calls vCountErrors second will be blocked when it tries to take the semaphore. In the language of reentrancy, we have made the use of cErrors atomic (not in the sense that it cannot be interrupted, but in the sense that it cannot be interrupted by anything we care about, that is, by anything that uses the shared variable) and therefore have made the function vCountErrors reentrant. The functions and data structures whose names begin with "NU" are those used in an RTOS called Nucleus.8 The NU_SUSPEND parameter to the NU_Obtain_Semaphore function is like the WAIT_FOREVER parameter in Figure 6.14.

You might ask: "Would the code in Figure 6.15 still work if the calls to NU_Obtain_Semaphore and NU_Release_Semaphore were around the calls to vCountErrors instead of being within the function itself?" Yes. However, that would not be a very smart way to write the program, because you would have to remember to take and release the semaphore around every call to the function. Having the semaphore calls inside vCountErrors makes it impossible to forget.

8. Nucleus is a trademark of Accelerated Technology Incorporated.


Figure 6.15 Semaphores Make a Function Reentrant

void Task1 (void)
{
   vCountErrors (9);
}

void Task2 (void)
{
   vCountErrors (11);
}

static int cErrors;
static NU_SEMAPHORE semErrors;

void vCountErrors (int cNewErrors)
{
   NU_Obtain_Semaphore (&semErrors, NU_SUSPEND);
   cErrors += cNewErrors;
   NU_Release_Semaphore (&semErrors);
}

Multiple Semaphores

In Figure 6.14 and Figure 6.15, you'll notice that the semaphore functions all take a parameter that identifies the semaphore that is being initialized, lowered, or raised. Since most RTOSs allow you to have as many semaphores as you like, each call to the RTOS must identify the semaphore on which to operate. The semaphores are all independent of one another: if one task takes semaphore A, another task can take semaphore B without blocking. Similarly, if one task is waiting for semaphore C, that task will still be blocked even if some other task releases semaphore D.


What's the advantage of having multiple semaphores? Whenever a task takes a semaphore, it is potentially slowing the response of any other task that needs the same semaphore. In a system with only one semaphore, if the lowest-priority task takes the semaphore to change data in a shared array of temperatures, the highest-priority task might block waiting for that semaphore, even if the highest-priority task wants to modify a count of the errors and couldn't care less about the temperatures. By having one semaphore protect the temperatures and a different semaphore protect the error count, you can build your system so the highest-priority task can modify the error count even if the lowest-priority task has taken the semaphore protecting the temperatures. Different semaphores can correspond to different shared resources.

How does the RTOS know which semaphore protects which data? It doesn't. If you are using multiple semaphores, it is up to you to remember which semaphore corresponds to which data. A task that is modifying the error count must take the corresponding semaphore. You must decide what shared data each of your semaphores protects.

Semaphores as a Signaling Device

Another common use of semaphores is as a simple way to communicate from one task to another or from an interrupt routine to a task. For example, suppose that the task that formats printed reports builds those reports into a fixed memory buffer. Suppose also that the printer interrupts after each line, and that the printer interrupt routine feeds the next line to the printer each time it interrupts. In such a system, after formatting one report into the fixed buffer, the task must wait until the interrupt routine has finished printing that report before it can format the next report.

One way to accomplish this fairly easily is to have the task wait for a semaphore after it has formatted each report. The interrupt routine signals the task when the report has been fed to the printer by releasing the semaphore; when the task gets the semaphore and unblocks, it knows that it can format the next report. (See Figure 6.16.)

9. See also Figure 11.11; Figure 6.16 is a cut-down version of the code in the tank monitoring system discussed in Chapter 11.

Note that the code in Figure 6.16 initializes the semaphore as already taken. Most RTOSs allow you to initialize semaphores in this way. When the task formats the first report and tries to take the semaphore, it blocks. The interrupt routine will release the semaphore and thereby unblock the task when the report is printed.

Figure 6.16 Using a Semaphore as a Signaling Device

/* Place to construct report. */
static char a_chPrint[10][21];

/* Count of lines in report. */
static int iLinesTotal;

/* Count of lines printed so far. */
static int iLinesPrinted;

/* Semaphore to wait for report to finish. */
static OS_EVENT *semPrinter;

void vPrinterTask(void)
{
   BYTE byError;   /* Place for an error return. */
   int wMsg;

   /* Initialize the semaphore as already taken. */
   semPrinter = OSSemCreate(0);

   while (TRUE)
   {
      /* Wait for a message telling what report to format. */
      wMsg = (int) OSQPend (QPrinterTask, WAIT_FOREVER, &byError);

      !! Format the report into a_chPrint
      iLinesTotal = !! count of lines in the report

      /* Print the first line of the report */
      iLinesPrinted = 0;
      vHardwarePrinterOutputLine (a_chPrint[iLinesPrinted++]);

      /* Wait for print job to finish. */
      OSSemPend (semPrinter, WAIT_FOREVER, &byError);
   }
}

void vPrinterInterrupt (void)
{
   if (iLinesPrinted == iLinesTotal)
      /* The report is done. Release the semaphore. */
      OSSemPost (semPrinter);
   else
      /* Print the next line. */
      vHardwarePrinterOutputLine (a_chPrint[iLinesPrinted++]);
}
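The signaling handshake in Figure 6.16 can be traced with a small single-threaded model. This is a sketch only: the semaphore is reduced to an integer, the hardware is left out, and the helper names vStartReport and fReportDone are made up for the illustration. It shows that the semaphore becomes available only on the interrupt that follows the last line.

```c
#include <assert.h>

#define LINES_IN_REPORT 3

static int cSemPrinter;      /* 0 = taken, 1 = available (toy semaphore) */
static int iLinesTotal;
static int iLinesPrinted;

/* Task side: format a report and hand the first line to the printer. */
void vStartReport(void)
{
    cSemPrinter = 0;              /* initialized as already taken */
    iLinesTotal = LINES_IN_REPORT;
    iLinesPrinted = 0;
    ++iLinesPrinted;              /* first line goes to the printer */
}

/* Interrupt side: called once each time the printer finishes a line. */
void vPrinterInterrupt(void)
{
    if (iLinesPrinted == iLinesTotal)
        cSemPrinter = 1;          /* report done: release the semaphore */
    else
        ++iLinesPrinted;          /* feed the next line */
}

/* Nonzero when the task's pending take would now succeed. */
int fReportDone(void)
{
    return cSemPrinter == 1;
}

void vSignalExample(void)
{
    vStartReport();
    assert(!fReportDone());
    vPrinterInterrupt();          /* line 1 finished, line 2 sent */
    assert(!fReportDone());
    vPrinterInterrupt();          /* line 2 finished, line 3 sent */
    assert(!fReportDone());
    vPrinterInterrupt();          /* line 3 finished: signal the task */
    assert(fReportDone());
}
```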

Semaphore Problems

When first reading about semaphores, it is very tempting to conclude that they represent the solutions to all of our shared-data problems. This is not true. In fact, your systems will probably work better, the fewer times you have to use semaphores. The problem is that semaphores work only if you use them perfectly, and there are no guarantees that you (or your coworkers) will do that. There are any number of tried-and-true ways to mess up with semaphores:

Forgetting to take the semaphore. Semaphores only work if every task that accesses the shared data, for read or for write, uses the semaphore. If anybody forgets, then the RTOS may switch away from the code that forgot to take the semaphore and cause an ugly shared-data bug.

Forgetting to release the semaphore. If any task fails to release the semaphore, then every other task that ever uses the semaphore will sooner or later block waiting to take that semaphore and will be blocked forever.

Taking the wrong semaphore. If you are using multiple semaphores, then taking the wrong one is as bad as forgetting to take one.

Holding a semaphore for too long. Whenever one task takes a semaphore, every other task that subsequently wants that semaphore has to wait until the semaphore is released. If one task takes the semaphore and then holds it for too long, other tasks may miss real-time deadlines.

A particularly perverse instance of this problem can arise if the RTOS switches from a low-priority task (call it Task C) to a medium-priority task (call it Task B) after Task C has taken a semaphore. A high-priority task (call it Task A) that wants the semaphore then has to wait until Task B gives up the microprocessor: Task C can't release the semaphore until it gets the microprocessor back. No matter how carefully you code Task C, Task B can prevent Task C from releasing the semaphore and can thereby hold up Task A indefinitely. (See Figure 6.17.) This problem is called priority inversion; some RTOSs resolve this problem with priority inheritance: they temporarily boost the priority of Task C to that of Task A whenever Task C holds the semaphore and Task A is waiting for it.

Figure 6.17 Priority Inversion

(Timeline of Task A, Task B, and Task C against time, marking the task the microprocessor is executing.)

1. Task C takes a semaphore that it shares with Task A.
2. Task B gets a message in its queue and unblocks; RTOS switches to Task B.
3. Task A gets a message in its queue and unblocks; RTOS switches to Task A.
4. Task A tries to take the semaphore that Task C already has taken.
5. Task B goes on running and running and running, never giving Task C a chance to release the semaphore. Task A is blocked.
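The scheduler's decision at the critical moment of a priority inversion can be written out in a few lines. The model below is a sketch with invented names (TASK, iPickTask, iWhoRuns), and it assumes that a larger number means a higher priority, which is not true of every RTOS. It captures the instant at which Task C holds the semaphore, Task A is blocked waiting for it, and Task B is ready.

```c
#include <assert.h>
#include <stdbool.h>

enum { TASK_A, TASK_B, TASK_C, TASK_COUNT, TASK_NONE = -1 };

typedef struct
{
    int  iPriority;      /* larger number = higher priority (an assumption) */
    bool fBlocked;
} TASK;

/* Pick the highest-priority task that is not blocked. */
static int iPickTask(const TASK a_task[TASK_COUNT])
{
    int iBest = TASK_NONE;
    int i;

    for (i = 0; i < TASK_COUNT; ++i)
        if (!a_task[i].fBlocked &&
            (iBest == TASK_NONE ||
             a_task[i].iPriority > a_task[iBest].iPriority))
            iBest = i;
    return iBest;
}

/*
 * Task C holds the semaphore, Task A is blocked waiting for it,
 * and Task B is ready. Returns the task the scheduler would run.
 */
int iWhoRuns(bool fPriorityInheritance)
{
    TASK a_task[TASK_COUNT] =
    {
        { 3, true  },    /* Task A: high priority, blocked on the semaphore */
        { 2, false },    /* Task B: medium priority, ready */
        { 1, false }     /* Task C: low priority, ready, holds the semaphore */
    };

    /* Inheritance: the holder runs at the waiter's priority for now. */
    if (fPriorityInheritance)
        a_task[TASK_C].iPriority = a_task[TASK_A].iPriority;

    return iPickTask(a_task);
}
```

Without inheritance, iWhoRuns(false) returns TASK_B, so Task C never gets to release the semaphore; with inheritance, iWhoRuns(true) returns TASK_C.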

Causing a deadly embrace. Figure 6.18 illustrates the problem called deadly embrace. The functions ajsmrsv and ajsmrls in that figure are from an RTOS called AMX.10 The function ajsmrsv "reserves" a semaphore, and the function ajsmrls "releases" the semaphore. The two additional parameters to ajsmrsv are time-out and priority information and are not relevant here. In the code in Figure 6.18 both vTask1 and vTask2 operate on variables a and b after getting permission to use them by getting semaphores hSemaphoreA and hSemaphoreB. Do you see the problem?

10. AMX is the trademark of Kadak Products, Ltd.

Figure 6.18 Deadly-Embrace Example

int a;
int b;
AMXID hSemaphoreA;
AMXID hSemaphoreB;

void vTask1 (void)
{
   ajsmrsv (hSemaphoreA, 0, 0);
   ajsmrsv (hSemaphoreB, 0, 0);
   a = b;
   ajsmrls (hSemaphoreB);
   ajsmrls (hSemaphoreA);
}

void vTask2 (void)
{
   ajsmrsv (hSemaphoreB, 0, 0);
   ajsmrsv (hSemaphoreA, 0, 0);
   b = a;
   ajsmrls (hSemaphoreA);
   ajsmrls (hSemaphoreB);
}

Consider what happens if vTask1 calls ajsmrsv to get hSemaphoreA, but before it can call ajsmrsv to get hSemaphoreB, the RTOS stops it and runs vTask2. The task vTask2 now calls ajsmrsv and gets hSemaphoreB. When vTask2 then calls ajsmrsv to get hSemaphoreA, it blocks, because another task (vTask1) already has that semaphore. The RTOS will now switch back to vTask1, which now calls ajsmrsv to get hSemaphoreB. Since vTask2 has hSemaphoreB, however, vTask1 now also blocks. There is no escape from this for either task, since both are now blocked waiting for the semaphore that the other has.

Of course, deadly-embrace problems would be easy to find and fix if they always appeared on one page of code such as in Figure 6.18. However, deadly embrace is just as deadly if vTask1 takes the first semaphore and then calls a subroutine that later takes a second one while vTask2 takes the second semaphore and then calls a subroutine that takes the first. In this case the problem will not be so obvious.
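The unlucky task switch behind a deadly embrace can be reproduced in a small model. The code below is a sketch, not AMX code: the name fDeadlocks is invented, each task is assumed to take exactly two different semaphores, and the model steps the two tasks alternately, one take attempt at a time. As a side observation, in this model the lockup disappears when both tasks take the semaphores in the same order.

```c
#include <assert.h>
#include <stdbool.h>

enum { SEM_A, SEM_B, SEM_COUNT };
#define NOBODY (-1)

/*
 * Each task takes two different semaphores in the given order, then
 * releases both. The tasks are stepped alternately, one take attempt
 * per step. Returns true if both tasks end up blocked forever.
 */
bool fDeadlocks(const int a_iOrder1[2], const int a_iOrder2[2])
{
    const int *a_p_iOrder[2] = { a_iOrder1, a_iOrder2 };
    int a_iHolder[SEM_COUNT] = { NOBODY, NOBODY };
    int a_cTaken[2] = { 0, 0 };      /* semaphores each task holds so far */
    bool a_fDone[2] = { false, false };
    int iTask;

    for (;;)
    {
        bool fProgress = false;

        for (iTask = 0; iTask < 2; ++iTask)
        {
            int iSem;

            if (a_fDone[iTask])
                continue;
            iSem = a_p_iOrder[iTask][a_cTaken[iTask]];
            if (a_iHolder[iSem] != NOBODY)
                continue;            /* this task would block */
            a_iHolder[iSem] = iTask;
            fProgress = true;
            if (++a_cTaken[iTask] == 2)
            {
                /* Got both: do the work, then release everything. */
                a_iHolder[SEM_A] = NOBODY;
                a_iHolder[SEM_B] = NOBODY;
                a_fDone[iTask] = true;
            }
        }
        if (a_fDone[0] && a_fDone[1])
            return false;            /* both tasks finished */
        if (!fProgress)
            return true;             /* nobody can move: deadly embrace */
    }
}

void vDeadlockExample(void)
{
    const int a_iAB[2] = { SEM_A, SEM_B };
    const int a_iBA[2] = { SEM_B, SEM_A };

    assert(fDeadlocks(a_iAB, a_iBA));     /* opposite orders lock up */
    assert(!fDeadlocks(a_iAB, a_iAB));    /* same order in both tasks */
}
```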

In summary, every use of semaphores is a bug waiting to happen. You use them when you have to and avoid them when you can. We'll discuss some ways to avoid semaphores in the next chapters.

Semaphore Variants

There are a number of different kinds of semaphores. Here is an overview of some of the more common variations:

Some systems offer semaphores that can be taken multiple times. Essentially, such semaphores are integers; taking them decrements the integer and releasing them increments the integer. If a task tries to take the semaphore when the integer is equal to zero, then the task will block. These semaphores are called counting semaphores, and they were the original type of semaphore.

Some systems offer semaphores that can be released only by the task that took them. These semaphores are useful for the shared-data problem, but they cannot be used to communicate between two tasks. Such semaphores are sometimes called resource semaphores or resources.

Some RTOSs offer one kind of semaphore that will automatically deal with the priority inversion problem and another that will not. The former kind of semaphore is commonly called a mutex semaphore or mutex. (Other RTOSs offer semaphores that they call mutexes but that do not deal with priority inversion.)

If several tasks are waiting for a semaphore when it is released, systems vary as to which task gets to run. Some systems will run the task that has been waiting longest; others will run the highest-priority task that is waiting for the semaphore. Some systems give you the choice.
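The counting semaphore described above really is little more than an integer. Here is a minimal sketch of the idea, with hypothetical names (COUNTSEM, fCountSemTryTake, vCountSemRelease) and with blocking replaced by a failure return, since the model has no scheduler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counting semaphore: just the integer described in the text. */
typedef struct
{
    int cCount;
} COUNTSEM;

/* Taking decrements the count; at zero the caller would block. */
bool fCountSemTryTake(COUNTSEM *p_sem)
{
    if (p_sem->cCount == 0)
        return false;          /* a real RTOS would block the task here */
    --p_sem->cCount;
    return true;
}

/* Releasing increments the count. */
void vCountSemRelease(COUNTSEM *p_sem)
{
    ++p_sem->cCount;
}

void vCountSemExample(void)
{
    COUNTSEM sem = { 2 };      /* can be taken twice before anyone blocks */

    assert(fCountSemTryTake(&sem));
    assert(fCountSemTryTake(&sem));
    assert(!fCountSemTryTake(&sem));    /* count is zero: would block */
    vCountSemRelease(&sem);
    assert(fCountSemTryTake(&sem));
}
```

A binary semaphore behaves like this model with the count never allowed above one.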

Ways to Protect Shared Data

We have discussed two ways to protect shared data: disabling interrupts and using semaphores. There is a third way that deserves at least a mention: disabling task switches. Most RTOSs have two functions you can call, one to disable task switches and one to reenable them after they've been disabled. As is easy to see, you can protect shared data from an inopportune task switch by disabling task switches while you are reading or writing the shared data.


Here's a comparison of the three methods of protecting shared data:

1. Disabling interrupts is the most drastic in that it will affect the response times of all the interrupt routines and of all other tasks in the system. (If you disable interrupts, you also disable task switches, because the scheduler cannot get control of the microprocessor to switch.) On the other hand, disabling interrupts has two advantages. (1) It is the only method that works if your data is shared between your task code and your interrupt routines. Interrupt routines are not allowed to take semaphores, as we will discuss in the next chapter, and disabling task switches does not prevent interrupts. (2) It is fast. Most processors can disable or enable interrupts with a single instruction; all of the RTOS functions are many instructions long. If a task's access to shared data lasts only a short period of time (incrementing a single variable, for example), sometimes it is preferable to take the shorter hit on interrupt service response than to take the longer hit on task response that you get from using a semaphore or disabling task switches.

2. Taking semaphores is the most targeted way to protect data, because it affects only those tasks that need to take the same semaphore. The response times of interrupt routines and of tasks that do not need the semaphore are unchanged. On the other hand, semaphores do take up a certain amount of microprocessor time (albeit not much in most RTOSs), and they will not work for interrupt routines.

3. Disabling task switches is somewhere in between the two. It has no effect on interrupt routines, but it stops response for all other tasks cold.

Chapter Summary

• A typical real-time operating system (RTOS) is smaller and offers fewer services than a standard operating system, and it is more closely linked to the application.

• RTOSs are widely available for sale, and it generally makes sense to buy one rather than to write one yourself.

• The task is the main building block for software written for an RTOS environment.

• Each task is always in one of three states: running, ready, and blocked. The scheduler in the RTOS runs the highest-priority ready task.

• Each task has its own stack; however, other data in the system is shared by all tasks. Therefore, the shared-data problem can reappear.

• A function that works properly even if it is called by more than one task is called a reentrant function.

• Semaphores can solve the shared-data problem. Since only one task can take a semaphore at a time, semaphores can prevent shared data from causing bugs. Semaphores have two associated functions: take and release.

• Your tasks can use semaphores to signal one another.

• You can introduce any number of ornery bugs with semaphores. Priority inversion and deadly embrace are two of the more obscure. Forgetting to take or release a semaphore or using the wrong one are more common ways to cause yourself problems.

• The mutex, the binary semaphore, and the counting semaphore are among the most common semaphore variants that RTOSs offer.

• Three methods to protect shared data are disabling interrupts, taking semaphores, and disabling task switches.

Problems

1. Is this function reentrant?

int cErrors;

void vCountErrors (int cNewErrors)
{
   cErrors += cNewErrors;
}

2. Is this function reentrant?

int strlen (char *p_sz)
{
   int iLength;

   iLength = 0;
   while (*p_sz != '\0')
   {
      ++iLength;
      ++p_sz;
   }
   return iLength;
}


3. Which of the numbered lines (lines 1-5) in the following function would lead you to suspect that this function is probably not reentrant?

static int iCount;

void vNotReentrant (int x, int *p)
{
   int y;

   /* Line 1 */ y = x * 2;
   /* Line 2 */ ++p;
   /* Line 3 */ *p = 123;
   /* Line 4 */ iCount += 234;
   /* Line 5 */ printf ("\nNew count: %d", x);
}

4. The following routines are called by Tasks A, B, and C, but they don't work. How would you fix the problems?

static int iRecordCount;

void increment_records (int iCount)
{
   OSSemGet (SEMAPHORE_PLUS);
   iRecordCount += iCount;
}

void decrement_records (int iCount)
{
   iRecordCount -= iCount;
   OSSemGive (SEMAPHORE_MINUS);
}

5. Where do you need to take and release the semaphores in the following code to make the function reentrant?

static int iValue;

int iFixValue (int iParm)
{
   int iTemp;

   iTemp = iValue;
   iTemp += iParm * 17;
   if (iTemp > 4922)
      iTemp = iParm;
   iValue = iTemp;

   iParm = iTemp + 179;
   if (iParm < 2000)
      return 1;
   else
      return 0;
}

6. For each of the following situations, discuss which of the three shared-data protection mechanisms seems most likely to be best and explain why.

(a.) Task M and Task N share an int array, and each often must update many elements in the array.

(b.) Task P shares a single char variable with one of the interrupt routines.

7. The task code and the interrupt routine in Figure 6.16 share the variables iLinesPrinted and iLinesTotal, but the task does not disable interrupts when it uses them. Is this a problem? Why or why not?

8. Assume that the following code is the only code in the system that uses the variable iSharedDeviceXData. The routine vGetDataFromDeviceX is an interrupt routine. Now suppose that instead of disabling all interrupts in vTaskZ, as shown below, we disable only the device X interrupt, allowing all other interrupts. Will this still protect the iSharedDeviceXData variable? If not, why not? If so, what are the advantages (if any) and disadvantages (if any) of doing this compared to disabling all interrupts?

int iSharedDeviceXData;

void interrupt vGetDataFromDeviceX (void)
{
   iSharedDeviceXData = !! Get data from device X hardware
   !! reset hardware
}

void vTaskZ (void)
/* Low priority task */
{
   int iTemp;

   while (FOREVER)
   {
      !! disable interrupts
      iTemp = iSharedDeviceXData;
      !! enable interrupts

      !! compute with iTemp
   }
}

9. A nonpreemptive RTOS will let a low-priority task continue to run, even when a higher-priority task becomes ready. This makes its response characteristics much more like those of one of the architectures we discussed in Chapter 5 than like those of a preemptive RTOS. Which of those architectures is most similar in its response characteristics to a nonpreemptive RTOS?

10. Consider this statement: "In a nonpreemptive RTOS, tasks cannot 'interrupt' one another; therefore there are no data-sharing problems among tasks." Would you agree with this?

7

More Operating System Services

This chapter covers the other features commonly offered by commercial RTOSs. We'll discuss intertask communication, timer services, memory management, events, and the interaction between interrupt routines and RTOSs.

7.1 Message Queues, Mailboxes, and Pipes

Tasks must be able to communicate with one another to coordinate their activities or to share data. For example, in the underground tank monitoring system the task that calculates the amount of gas in the tanks must let other parts of the system know how much gasoline there is. In Telegraph, the system we discussed in Chapter 1 that connects a serial-port printer to a network, the tasks that receive data on the network must hand that data off to other tasks that pass the data on to the printer or that determine responses to send on the network.

In Chapter 6 we discussed using shared data and semaphores to allow tasks to communicate with one another. In this section we will discuss several other methods that most RTOSs offer: queues, mailboxes, and pipes.

Here's a very simple example. Suppose that we have two tasks, Task1 and Task2, each of which has a number of high-priority, urgent things to do. Suppose also that from time to time these two tasks discover error conditions that must be reported on a network, a time-consuming process. In order not to delay Task1 and Task2, it makes sense to have a separate task, ErrorsTask, that is responsible for reporting the error conditions on the network. Whenever Task1 or Task2 discovers an error, it reports that error to ErrorsTask and then goes on about its own business. The error reporting process undertaken by ErrorsTask does not delay the other tasks.

An RTOS queue is the way to implement this design. Figure 7.1 shows how it is done. In Figure 7.1, when Task1 or Task2 needs to log errors, it calls vLogError. The vLogError function puts the error on a queue of errors for ErrorsTask to deal with.

The AddToQueue function adds (many people use the term posts) the value of the integer parameter it is passed to a queue of integer values the RTOS maintains internally. The ReadFromQueue function reads the value at the head of the queue and returns it to the caller. If the queue is empty, ReadFromQueue blocks the calling task. The RTOS guarantees that both of these functions are reentrant. If the RTOS switches from Task1 to Task2 when Task1 is in the middle of AddToQueue, and if Task2 subsequently calls AddToQueue, the RTOS ensures that things still work. Each time ErrorsTask calls ReadFromQueue, it gets the next error from the queue, even if the RTOS switches from ErrorsTask to Task1 to Task2 and back again in the middle of the call.

Figure 7.1 Simple Use of a Queue

/* RTOS queue function prototypes */
void AddToQueue (int iData);
void ReadFromQueue (int *p_iData);

void Task1 (void)
{
   if (!!problem arises)
      vLogError (ERROR_TYPE_X);

   !! Other things that need to be done soon.
}

void Task2 (void)
{
   if (!!problem arises)
      vLogError (ERROR_TYPE_Y);

   !! Other things that need to be done soon.
}

void vLogError (int iErrorType)
{
   AddToQueue (iErrorType);
}

static int cErrors;

void ErrorsTask (void)
{
   int iErrorType;

   while (FOREVER)
   {
      ReadFromQueue (&iErrorType);
      ++cErrors;

      !! Send cErrors and iErrorType out on network
   }
}
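To make the first-in, first-out behavior of AddToQueue and ReadFromQueue concrete, here is a minimal single-threaded stand-in for the two functions. It is a sketch under stated assumptions, not how any particular RTOS implements its queues: the capacity QUEUE_CAPACITY and the helper QueueCount are invented for illustration, there is no protection against task switches, and where a real RTOS would block the caller this version simply asserts.

```c
#include <assert.h>

#define QUEUE_CAPACITY 8   /* assumed size; an RTOS sets its own limits */

static int aiQueue[QUEUE_CAPACITY];
static int iHead;     /* index of the oldest entry */
static int cEntries;  /* number of entries waiting to be read */

/* Post one integer at the tail of the queue. */
void AddToQueue(int iData)
{
    /* A real RTOS would block or return an error when the queue is full. */
    assert(cEntries < QUEUE_CAPACITY);
    aiQueue[(iHead + cEntries) % QUEUE_CAPACITY] = iData;
    ++cEntries;
}

/* Remove the integer at the head of the queue. */
void ReadFromQueue(int *p_iData)
{
    /* A real RTOS would block the calling task here until data arrives. */
    assert(cEntries > 0);
    *p_iData = aiQueue[iHead];
    iHead = (iHead + 1) % QUEUE_CAPACITY;
    --cEntries;
}

/* Hypothetical helper: how many entries are waiting. */
int QueueCount(void)
{
    return cEntries;
}
```

If two errors are posted one after the other, the reader gets them back in the order they were posted. What the RTOS adds on top of this is the reentrancy guarantee: it protects the index and count updates so that a task switch in the middle of either function cannot corrupt them.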

Some Ugly Details

As you've no doubt guessed, queues are not quite as simple as the two functions illustrated in Figure 7.1. Here are some of the complications that you will have to deal with in most RTOSs:

• Most RTOSs require that you initialize your queues before you use them, by calling a function provided for this purpose. On some systems, it is also up to you to allocate the memory that the RTOS will manage as a queue. As with semaphores, it makes most sense to initialize queues in some code that is guaranteed to run before any task tries to use them.

• Since most RTOSs allow you to have as many queues as you want, you pass an additional parameter to every queue function: the identity of the queue to which you want to write or from which you want to read. Various systems do this in various ways.

• If your code tries to write to a queue when the queue is full, the RTOS must either return an error to let you know that the write operation failed (a more common RTOS behavior) or it must block the task until some other task reads data from the queue and thereby creates some space (a less common RTOS behavior). Your code must deal with whichever of these behaviors your RTOS exhibits.

• Many RTOSs include a function that will read from a queue if there is any data and will return an error code if not. This function is in addition to the one that will block your task if the queue is empty.

• The amount of data that the RTOS lets you write to the queue in one call may not be the amount that you want to write. Many RTOSs are inflexible about this. One common RTOS characteristic is to allow you to write onto a queue in one call the number of bytes taken up by a void pointer.
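The complications listed above can be seen together in one small sketch. The QUEUE type, the status codes, and the functions QueueInit, QueueWrite, and QueueTryRead are hypothetical names chosen for illustration; they do not belong to any real RTOS. The sketch assumes an interface in which the application supplies the storage, every call names the queue it operates on, a write to a full queue returns an error, and a non-blocking read reports an empty queue instead of waiting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; real RTOSs define their own. */
enum { Q_OK = 0, Q_FULL = 1, Q_EMPTY = 2 };

/* Hypothetical queue control block. */
typedef struct {
    void  **ppvSlots;  /* storage supplied by the application */
    size_t  cSlots;    /* capacity of that storage */
    size_t  iHead;     /* index of the oldest message */
    size_t  cUsed;     /* number of messages currently queued */
} QUEUE;

/* Must run before any task tries to use the queue. */
void QueueInit(QUEUE *pq, void **ppvStorage, size_t cSlots)
{
    pq->ppvSlots = ppvStorage;
    pq->cSlots = cSlots;
    pq->iHead = 0;
    pq->cUsed = 0;
}

/* Write one void pointer; report an error rather than block when full. */
int QueueWrite(QUEUE *pq, void *pvMsg)
{
    if (pq->cUsed == pq->cSlots)
        return Q_FULL;
    pq->ppvSlots[(pq->iHead + pq->cUsed) % pq->cSlots] = pvMsg;
    ++pq->cUsed;
    return Q_OK;
}

/* Non-blocking read: return Q_EMPTY instead of waiting for data. */
int QueueTryRead(QUEUE *pq, void **ppvMsg)
{
    if (pq->cUsed == 0)
        return Q_EMPTY;
    *ppvMsg = pq->ppvSlots[pq->iHead];
    pq->iHead = (pq->iHead + 1) % pq->cSlots;
    --pq->cUsed;
    return Q_OK;
}
```

Callers have to check the return value of every write. A task that ignores Q_FULL silently loses the message, which is exactly the kind of behavior your code must be prepared for when the RTOS reports errors rather than blocking.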

Figure 7.2 is the same program as Figure 7.1, except with more realistic RTOS function calls, the calls used in µC/OS.

Pointers and Queues

Figure 7.2 illustrates one fairly common RTOS interface, which allows you to write one void pointer to the queue with each call. It also illustrates the fairly common coding technique people use to send a small amount of data: casting that data as a void pointer. The obvious idea behind this style of RTOS interface is that one task can pass any amount of data to another task by putting the data into a buffer and then writing a pointer to the buffer onto the queue. Figure 7.3 illustrates this latter technique. The vReadTemperaturesTask task calls the C library malloc function to allocate a new data buffer for each pair of temperatures and writes a pointer to that buffer into the queue. vMainTask subsequently reads the pointer to the buffer from the queue, compares the temperatures, and frees the buffer.
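Both techniques described above fit in a few lines of C. This is a sketch only: PostPointer and PendPointer are hypothetical one-slot stand-ins for an RTOS queue that carries a single void pointer per call, and the other function names are likewise invented for illustration. The cast through intptr_t assumes that the integer being sent fits in a pointer, which is true for small error codes on common targets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One-slot stand-in for an RTOS queue of void pointers. Hypothetical:
   a real queue holds many entries and blocks the reader when empty. */
static void *pvSlot;

void PostPointer(void *pvMsg)
{
    pvSlot = pvMsg;
}

void *PendPointer(void)
{
    void *pv = pvSlot;
    pvSlot = NULL;
    return pv;
}

/* Small data: the integer itself travels in the pointer-sized slot. */
void SendErrorCode(int iErrorType)
{
    PostPointer((void *)(intptr_t)iErrorType);
}

int ReceiveErrorCode(void)
{
    return (int)(intptr_t)PendPointer();
}

/* Larger data: allocate a buffer, fill it, and send only its address. */
int SendTemperatures(int iTemp1, int iTemp2)
{
    int *p_iTemps = malloc(2 * sizeof *p_iTemps);
    if (p_iTemps == NULL)
        return 0;   /* allocation failed; nothing was sent */
    p_iTemps[0] = iTemp1;
    p_iTemps[1] = iTemp2;
    PostPointer(p_iTemps);
    return 1;
}

/* The receiver owns the buffer once it has read the pointer,
   so the receiver is the one that frees it. */
int TemperaturesMatch(void)
{
    int *p_iTemps = PendPointer();
    int fMatch = (p_iTemps[0] == p_iTemps[1]);
    free(p_iTemps);
    return fMatch;
}
```

The design point worth noticing is ownership. After the sending task posts the pointer it must not touch the buffer again, because the receiving task may free it at any moment.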

Mailboxes

In general, mailboxes are much like queues. The typical RTOS has functions to create, to write to, and to read from mailboxes, and perhaps functions to check


Figure 7.2 More Realistic Use of a Queue

/* RTOS queue function prototypes */
OS_EVENT *OSQCreate (void **ppStart, BYTE bySize);
unsigned char OSQPost (OS_EVENT *pOse, void *pvMsg);
void *OSQPend (OS_EVENT *pOse, WORD wTimeout, BYTE *pByErr);

#define WAIT_FOREVER 0

/* Our message queue */
static OS_EVENT *pOseQueue;

/* The data space for our queue. The RTOS will manage this. */
#define SIZEOF_QUEUE 25
void *apvQueue[SIZEOF_QUEUE];

void main (void)
{
   /* The queue gets initialized before the tasks are started */
   pOseQueue = OSQCreate (apvQueue, SIZEOF_QUEUE);

   !! Start Task1
   !! Start Task2
}

void Task1 (void)
{
   if (!!problem arises)
      vLogError (ERROR_TYPE_X);

   !! Other things that need to be done soon.
}

void Task2 (void)
{

(continued)
