Chapter - 03: Accessing I/O Devices


ACCESSING I/O DEVICES: A simple arrangement to connect I/O devices to a computer is to use a single bus
arrangement. The bus enables all the devices connected to it to exchange information.
Typically, it consists of three sets of lines used to carry address, data, and control signals.
Each I/O device is assigned a unique set of addresses. When the processor places a
particular address on the address line, the device that recognizes this address responds to
the commands issued on the control lines. The processor requests either a read or a write
operation, and the requested data are transferred over the data lines. When I/O devices and
the memory share the same address space, the arrangement is called memory-mapped
I/O.
With memory-mapped I/O, any machine instruction that can access memory can
be used to transfer data to or from an I/O device. For example, if DATAIN is the address
of the input buffer associated with the keyboard, the instruction
Move DATAIN, R0
reads the data from DATAIN and stores them into processor register R0. Similarly, the
instruction
Move R0, DATAOUT
sends the contents of register R0 to location DATAOUT, which may be the output data
buffer of a display unit or a printer.
Most computer systems use memory-mapped I/O. Some processors have special
In and Out instructions to perform I/O transfers. When building a computer system based
on these processors, the designer has the option of connecting I/O devices to use the
special I/O address space or simply incorporating them as part of the memory address
space. The I/O devices examine the low-order bits of the address bus to determine
whether they should respond.
The hardware required to connect an I/O device to the bus consists of an address
decoder, data and status registers, and control circuitry. The address decoder
enables the device to recognize its address when this address appears on the address lines.
The data register holds the data being transferred to or from the processor. The status
register contains information relevant to the operation of the I/O device. Both the data
and status registers are connected to the data bus and assigned unique addresses. The
address decoder, the data and status registers, and the control circuitry required to
coordinate I/O transfers constitute the device's interface circuit.
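The interface just described can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular hardware: the addresses 0x4000 and 0x4004 and the class names KeyboardInterface and Bus are assumptions introduced for the example.

```python
class KeyboardInterface:
    """Toy device interface: an address decoder plus data and status registers."""

    DATAIN = 0x4000   # hypothetical address of the input data register
    STATUS = 0x4004   # hypothetical address of the status register

    def __init__(self):
        self.data = 0
        self.status = 0

    def decodes(self, address):
        # Address decoder: recognize only the addresses assigned to this device.
        return address in (self.DATAIN, self.STATUS)

    def read(self, address):
        return self.data if address == self.DATAIN else self.status


class Bus:
    """Single bus shared by the memory and the devices (memory-mapped I/O)."""

    def __init__(self, devices):
        self.memory = {}
        self.devices = devices

    def read(self, address):
        # The device that recognizes the address responds; otherwise memory does.
        for device in self.devices:
            if device.decodes(address):
                return device.read(address)
        return self.memory.get(address, 0)


keyboard = KeyboardInterface()
keyboard.data = ord("A")
bus = Bus([keyboard])
r0 = bus.read(KeyboardInterface.DATAIN)   # plays the role of: Move DATAIN, R0
```

Because the device registers live in the same address space as memory, the same read operation serves both, which is the point of memory-mapped I/O.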
I/O devices operate at speeds that are vastly different from that of the processor.
When a human operator is entering characters at a keyboard, the processor is capable of
executing millions of instructions between successive character entries. An instruction
that reads a character from the keyboard should be executed only when a character is
available in the input buffer of the keyboard interface. Also, we must make sure that an
input character is read only once.
This example illustrates program-controlled I/O, in which the processor
repeatedly checks a status flag to achieve the required synchronization between the
processor and an input or output device. We say that the processor polls the device. There
are two other commonly used mechanisms for implementing I/O operations: interrupts
and direct memory access. In the case of interrupts, synchronization is achieved by
having the I/O device send a special signal over the bus whenever it is ready for a data
transfer operation. Direct memory access is a technique used for high-speed I/O devices.
It involves having the device interface transfer data directly to or from the memory,
without continuous involvement by the processor.
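A minimal sketch of program-controlled I/O follows, assuming a keyboard interface with a status flag SIN that is set when a character is available and cleared when the data register is read. The class PolledKeyboard, the function read_chars, and the delay parameter are hypothetical names chosen for this illustration.

```python
from collections import deque


class PolledKeyboard:
    """Toy keyboard interface that needs `delay` ticks to produce each character."""

    def __init__(self, keystrokes, delay=3):
        self._pending = deque(keystrokes)
        self._delay = delay
        self._countdown = delay
        self.sin = 0          # status flag: 1 when DATAIN holds an unread character
        self.datain = None

    def tick(self):
        # Device side: after `delay` ticks, load the next character and raise SIN.
        if self.sin == 0 and self._pending:
            self._countdown -= 1
            if self._countdown == 0:
                self.datain = self._pending.popleft()
                self.sin = 1
                self._countdown = self._delay

    def read_datain(self):
        # Reading DATAIN clears SIN, so the same character is never read twice.
        self.sin = 0
        return self.datain


def read_chars(device, count):
    """Poll SIN until `count` characters have been read; also count the polls."""
    chars, polls = [], 0
    while len(chars) < count:
        device.tick()             # time passes on the device side
        polls += 1
        if device.sin == 1:       # the processor polls the status flag
            chars.append(device.read_datain())
    return "".join(chars), polls


text, polls = read_chars(PolledKeyboard("ok", delay=3), 2)
```

The poll count exceeds the number of characters read, which illustrates the processor time spent waiting in the polling loop.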
The routine executed in response to an interrupt request is called the interrupt-service
routine, which is the PRINT routine in our example. Interrupts bear considerable
resemblance to subroutine calls. Assume that an interrupt request arrives during
execution of instruction i in figure 1.

Figure 1. Transfer of control through the use of interrupts (Program 1, the COMPUTER routine, is interrupted at instruction i; Program 2, the PRINT routine, is executed; Program 1 then resumes at instruction i+1)


The processor first completes execution of instruction i. Then, it loads the
program counter with the address of the first instruction of the interrupt-service routine.
For the time being, let us assume that this address is hardwired in the processor. After
execution of the interrupt-service routine, the processor has to come back to instruction
i+1. Therefore, when an interrupt occurs, the current contents of the PC, which point to
instruction i+1, must be put in temporary storage in a known location. A Return-from-interrupt
instruction at the end of the interrupt-service routine reloads the PC from the
temporary storage location, causing execution to resume at instruction i+1. In many
processors, the return address is saved on the processor stack.
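The transfer of control described above can be sketched as follows. This is a simplified model under stated assumptions: the interrupt-service routine starts at a hardwired address (ISR_ADDRESS = 100 is an arbitrary value chosen for the example), each instruction occupies one location, and the class name Processor is hypothetical.

```python
class Processor:
    ISR_ADDRESS = 100   # hypothetical hardwired start of the interrupt-service routine

    def __init__(self):
        self.pc = 0
        self.stack = []
        self.interrupt_pending = False
        self.trace = []     # addresses of the instructions actually executed

    def step(self):
        # Complete the current instruction (instruction i) first.
        self.trace.append(self.pc)
        self.pc += 1
        # Only then accept a pending interrupt request.
        if self.interrupt_pending:
            self.interrupt_pending = False
            self.stack.append(self.pc)   # save the return address (instruction i+1)
            self.pc = self.ISR_ADDRESS

    def return_from_interrupt(self):
        # Reload the PC from the saved location on the processor stack.
        self.pc = self.stack.pop()


cpu = Processor()
cpu.step()                     # instruction 0
cpu.interrupt_pending = True   # request arrives during instruction 1
cpu.step()                     # instruction 1 completes, then control transfers
cpu.step()                     # one instruction of the interrupt-service routine
cpu.return_from_interrupt()
cpu.step()                     # execution resumes at instruction 2
```

The trace shows instruction 1 completing before the service routine runs, and the interrupted program resuming at the instruction that follows it.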
We should note that as part of handling interrupts, the processor must inform the
device that its request has been recognized so that it may remove its interrupt-request
signal. This may be accomplished by means of a special control signal on the bus, an
interrupt-acknowledge signal. Alternatively, the execution of an instruction in the
interrupt-service routine that accesses a status or data register in the device interface
implicitly informs that device that its interrupt request has been recognized.
So far, treatment of an interrupt-service routine is very similar to that of a
subroutine. An important departure from this similarity should be noted. A subroutine
performs a function required by the program from which it is called. However, the
interrupt-service routine may not have anything in common with the program being
executed at the time the interrupt request is received. In fact, the two programs often
belong to different users. Therefore, before starting execution of the interrupt-service
routine, any information that may be altered during the execution of that routine must be
saved. This information must be restored before execution of the interrupted program is
resumed. In this way, the original program can continue execution without being affected
in any way by the interruption, except for the time delay. The information that needs to
be saved and restored typically includes the condition code flags and the contents of any
registers used by both the interrupted program and the interrupt-service routine.
The task of saving and restoring information can be done automatically by the
processor or by program instructions. Most modern processors save only the minimum
amount of information needed to maintain the integrity of program execution. This is
because the process of saving and restoring registers involves memory transfers that
increase the total execution time, and hence represent execution overhead. Saving
registers also increases the delay between the time an interrupt request is received and the
start of execution of the interrupt-service routine. This delay is called interrupt latency.
In some earlier processors, particularly those with a small number of registers, all
registers are saved automatically by the processor hardware at the time an interrupt
request is accepted. The data saved are restored to their respective registers as part of the
execution of the Return-from-interrupt instruction. Some computers provide two types of
interrupts. One saves all register contents, and the other does not. A particular I/O device
may use either type, depending upon its response-time requirements. Another interesting
approach is to provide duplicate sets of processor registers. In this case, a different set of
registers can be used by the interrupt-service routine, thus eliminating the need to save
and restore registers.
INTERRUPT HARDWARE: We pointed out that an I/O device requests an interrupt by activating a bus line
called interrupt-request. Most computers are likely to have several I/O devices that can
request an interrupt. A single interrupt-request line may be used to serve n devices as
depicted. All devices are connected to the line via switches to ground. To request an
interrupt, a device closes its associated switch. Thus, if all interrupt-request signals
INTR1 to INTRn are inactive, that is, if all switches are open, the voltage on the interrupt-request
line will be equal to Vdd. This is the inactive state of the line. Since the closing of
one or more switches will cause the line voltage to drop to 0, the value of INTR is the
logical OR of the requests from individual devices, that is,
INTR = INTR1 + ... + INTRn
It is customary to use the complemented form of INTR (written with an overbar) to name the interrupt-request
signal on the common line, because this signal is active when in the low-voltage state.
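The relationship between the switch positions, the line voltage, and the logical value of INTR can be checked with a short sketch. The function names and the use of 1 and 0 for the Vdd and ground levels are assumptions made for illustration.

```python
V_DD = 1     # level of the pulled-up line when no device is requesting
GROUND = 0   # level of the line when at least one switch is closed


def line_level(switch_closed):
    """Electrical level of the shared line for a list of switch states."""
    return GROUND if any(switch_closed) else V_DD


def intr(switch_closed):
    """Logical request INTR = INTR1 + ... + INTRn (true when the line is low)."""
    return line_level(switch_closed) == GROUND
```

With all switches open the line stays at Vdd and INTR is false; closing any one switch pulls the line low and makes INTR true.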
ENABLING AND DISABLING INTERRUPTS: The facilities provided in a computer must give the programmer complete control
over the events that take place during program execution. The arrival of an interrupt
request from an external device causes the processor to suspend the execution of one
program and start the execution of another. Because interrupts can arrive at any time,
they may alter the sequence of events from that envisaged by the programmer. Hence, the
interruption of program execution must be carefully controlled.
Let us consider in detail the specific case of a single interrupt request from one
device. When a device activates the interrupt-request signal, it keeps this signal activated
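until it learns that the processor has accepted its request. This means that the interrupt-request
signal will be active during execution of the interrupt-service routine, perhaps
until an instruction is reached that accesses the device in question. It is essential to
ensure that this active request signal does not lead to successive interruptions, which
would cause the system to enter an infinite loop from which it cannot recover. Several
mechanisms are available to solve this problem.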
The first possibility is to have the processor hardware ignore the interrupt-request
line until the execution of the first instruction of the interrupt-service routine has been
completed. Then, by using an Interrupt-disable instruction as the first instruction in the
interrupt-service routine, the programmer can ensure that no further interruptions will
occur until an Interrupt-enable instruction is executed. Typically, the Interrupt-enable
instruction will be the last instruction in the interrupt-service routine before the
Return-from-interrupt instruction. The processor must guarantee that execution of the
Return-from-interrupt instruction is completed before further interruption can occur.
The second option, which is suitable for a simple processor with only one
interrupt-request line, is to have the processor automatically disable interrupts before
starting the execution of the interrupt-service routine. After saving the contents of the PC
and the processor status register (PS) on the stack, the processor performs the equivalent
of executing an Interrupt-disable instruction. It is often the case that one bit in the PS
register, called Interrupt-enable, indicates whether interrupts are enabled.
In the third option, the processor has a special interrupt-request line for which the
interrupt-handling circuit responds only to the leading edge of the signal. Such a line is
said to be edge-triggered.

Before proceeding to study more complex aspects of interrupts, let us summarize
the sequence of events involved in handling an interrupt request from a single device.
Assuming that interrupts are enabled, the following is a typical scenario.
1. The device raises an interrupt request.
2. The processor interrupts the program currently being executed.
3. Interrupts are disabled by changing the control bits in the PS (except in the case of
edge-triggered interrupts).
4. The device is informed that its request has been recognized, and in response, it
deactivates the interrupt-request signal.
5. The action requested by the interrupt is performed by the interrupt-service routine.
6. Interrupts are enabled and execution of the interrupted program is resumed.
HANDLING MULTIPLE DEVICES: Let us now consider the situation where a number of devices capable of initiating
interrupts are connected to the processor. Because these devices are operationally
independent, there is no definite order in which they will generate interrupts. For
example, device X may request an interrupt while an interrupt caused by device Y is
being serviced, or several devices may request interrupts at exactly the same time. This
gives rise to a number of questions:
1. How can the processor recognize the device requesting an interrupt?
2. Given that different devices are likely to require different interrupt-service
routines, how can the processor obtain the starting address of the appropriate
routine in each case?
3. Should a device be allowed to interrupt the processor while another interrupt is
being serviced?
4. How should two or more simultaneous interrupt requests be handled?
The means by which these problems are resolved vary from one computer to another,
and the approach taken is an important consideration in determining the computer's
suitability for a given application.
When a request is received over the common interrupt-request line, additional
information is needed to identify the particular device that activated the line.
The information needed to determine whether a device is requesting an interrupt
is available in its status register. When a device raises an interrupt request, it sets to 1 one
of the bits in its status register, which we will call the IRQ bit. For example, bits KIRQ
and DIRQ are the interrupt request bits for the keyboard and the display, respectively.
The simplest way to identify the interrupting device is to have the interrupt-service
routine poll all the I/O devices connected to the bus. The first device encountered with its
IRQ bit set is the device that should be serviced. An appropriate subroutine is called to
provide the requested service.
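The polling scheme can be sketched as a loop over the status registers in a fixed order. The position of the IRQ bit (mask 0x80) and the function name find_interrupting_device are assumptions for this example; a real interface defines its own bit layout.

```python
def find_interrupting_device(status_registers, irq_mask=0x80):
    """Return the name of the first polled device whose IRQ bit is set, or None.

    `status_registers` is a list of (device name, status value) pairs given in
    polling order; `irq_mask` is a hypothetical position for the IRQ bit.
    """
    for name, status in status_registers:
        if status & irq_mask:
            return name
    return None


# Display and printer are both requesting; the display is polled first.
first = find_interrupting_device(
    [("keyboard", 0x00), ("display", 0x80), ("printer", 0x80)]
)
```

Every device ahead of the requester is interrogated even though it needs no service, which is the cost the text attributes to polling.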

The polling scheme is easy to implement. Its main disadvantage is the time spent
interrogating the IRQ bits of all the devices that may not be requesting any service. An
alternative approach is to use vectored interrupts, which we describe next.
Vectored Interrupts: To reduce the time involved in the polling process, a device requesting an
interrupt may identify itself directly to the processor. Then, the processor can
immediately start executing the corresponding interrupt-service routine. The term
vectored interrupts refers to all interrupt-handling schemes based on this approach.
A device requesting an interrupt can identify itself by sending a special code to
the processor over the bus. This enables the processor to identify individual devices even
if they share a single interrupt-request line. The code supplied by the device may
represent the starting address of the interrupt-service routine for that device. The code
length is typically in the range of 4 to 8 bits. The remainder of the address is supplied by
the processor based on the area in its memory where the addresses for interrupt-service
routines are located.
This arrangement implies that the interrupt-service routine for a given device
must always start at the same location. The programmer can gain some flexibility by
storing in this location an instruction that causes a branch to the appropriate routine.
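One way to picture the vectored scheme is shown below. The base address, the entry size, and the routine addresses are all hypothetical values, and the table lookup stands in for the branch instruction stored at the vector location.

```python
VECTOR_BASE = 0x0100   # hypothetical part of the address supplied by the processor
ENTRY_SIZE = 4         # hypothetical size in bytes of one vector location


def vector_address(device_code, code_bits=8):
    """Combine the short code sent by the device with the processor-supplied base."""
    if not 0 <= device_code < (1 << code_bits):
        raise ValueError("device code does not fit in the vector field")
    return VECTOR_BASE + device_code * ENTRY_SIZE


# Each vector location holds the target of a branch to the actual routine,
# so the routines themselves can be placed anywhere in memory.
vector_table = {
    vector_address(3): 0x2000,   # hypothetical keyboard service routine
    vector_address(5): 0x2400,   # hypothetical display service routine
}


def dispatch(device_code):
    """Return the starting address of the routine for the interrupting device."""
    return vector_table[vector_address(device_code)]
```

No device is polled here: the code sent over the bus selects the routine directly.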
Interrupt Nesting: Interrupts should be disabled during the execution of an interrupt-service routine,
to ensure that a request from one device will not cause more than one interruption. The
same arrangement is often used when several devices are involved, in which case
execution of a given interrupt-service routine, once started, always continues to
completion before the processor accepts an interrupt request from a second device.
Interrupt-service routines are typically short, and the delay they may cause is acceptable
for most simple devices.
For some devices, however, a long delay in responding to an interrupt request
may lead to erroneous operation. Consider, for example, a computer that keeps track of
the time of day using a real-time clock. This is a device that sends interrupt requests to
the processor at regular intervals. For each of these requests, the processor executes a
short interrupt-service routine to increment a set of counters in the memory that keep
track of time in seconds, minutes, and so on. Proper operation requires that the delay in
responding to an interrupt request from the real-time clock be small in comparison with
the interval between two successive requests. To ensure that this requirement is satisfied
in the presence of other interrupting devices, it may be necessary to accept an interrupt
request from the clock during the execution of an interrupt-service routine for another
device.

This example suggests that I/O devices should be organized in a priority
structure. An interrupt request from a high-priority device should be accepted while the
processor is servicing another request from a lower-priority device.
A multiple-level priority organization means that during execution of an
interrupt-service routine, interrupt requests will be accepted from some devices but not
from others, depending upon the device's priority. To implement this scheme, we can
assign a priority level to the processor that can be changed under program control. The
priority level of the processor is the priority of the program that is currently being
executed. The processor accepts interrupts only from devices that have priorities higher
than its own.
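The acceptance rule can be sketched as follows, assuming that priorities are small integers and that the priority of an interrupted program is saved on a stack. The class name PriorityProcessor and the device names are hypothetical.

```python
class PriorityProcessor:
    def __init__(self):
        self.priority = 0   # priority of the program currently being executed
        self.saved = []     # priorities of the programs that were interrupted
        self.log = []

    def request(self, device, level):
        """Accept the request only if its priority is higher than the processor's."""
        if level <= self.priority:
            self.log.append(f"ignored {device}")
            return False
        self.saved.append(self.priority)
        self.priority = level   # the processor now runs at the device's level
        self.log.append(f"accepted {device}")
        return True

    def return_from_interrupt(self):
        # Restore the priority of the interrupted program.
        self.priority = self.saved.pop()


cpu = PriorityProcessor()
cpu.request("disk", 2)       # accepted: 2 is higher than 0
cpu.request("keyboard", 1)   # not accepted while the disk routine runs
cpu.request("clock", 3)      # accepted: nests inside the disk routine
cpu.return_from_interrupt()  # back in the disk routine at priority 2
```

In real hardware a request that is not accepted stays pending on its line until the processor's priority drops; the sketch only records that it was not accepted at that moment.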
The processor's priority is usually encoded in a few bits of the processor status
word. It can be changed by program instructions that write into the PS. These are
privileged instructions, which can be executed only while the processor is running in the
supervisor mode. The processor is in the supervisor mode only when executing operating
system routines. It switches to the user mode before beginning to execute application
programs. Thus, a user program cannot accidentally, or intentionally, change the priority
of the processor and disrupt the system's operation. An attempt to execute a privileged
instruction while in the user mode leads to a special type of interrupt called a privilege
exception.

A multiple-priority scheme can be implemented easily by using separate
interrupt-request and interrupt-acknowledge lines for each device, as shown in figure 2.
Each of the interrupt-request lines is assigned a different priority level. Interrupt requests
received over these lines are sent to a priority arbitration circuit in the processor. A
request is accepted only if it has a higher priority level than that currently assigned to the
processor.

Figure 2: Implementation of interrupt priority using individual interrupt-request and acknowledge lines (devices 1 to p, each with its own INTR and INTA line to the priority arbitration circuit in the processor)
Simultaneous Requests: Let us now consider the problem of simultaneous arrivals of interrupt requests
from two or more devices. The processor must have some means of deciding which
request to service first. Using a priority scheme such as that of figure 2, the solution is
straightforward. The processor simply accepts the request having the highest priority.
Polling the status registers of the I/O devices is the simplest such mechanism. In
this case, priority is determined by the order in which the devices are polled. When
vectored interrupts are used, we must ensure that only one device is selected to send its
interrupt vector code. A widely used scheme is to connect the devices to form a daisy
chain, as shown in figure 3a. The interrupt-request line INTR is common to all devices.
The interrupt-acknowledge line, INTA, is connected in a daisy-chain fashion, such that
the INTA signal propagates serially through the devices.

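Figure 3: (a) Daisy chain, in which devices 1 to n share the INTR line and the INTA line passes from the processor through each device in turn; (b) arrangement of priority groups, in which lines INTR 1 to INTR p and INTA 1 to INTA p connect groups of devices to the priority arbitration circuit in the processor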

When several devices raise an interrupt request and the INTR line is activated,
the processor responds by setting the INTA line to 1. This signal is received by device 1.
Device 1 passes the signal on to device 2 only if it does not require any service. If device
1 has a pending request for interrupt, it blocks the INTA signal and proceeds to put its
identifying code on the data lines. Therefore, in the daisy-chain arrangement, the device
that is electrically closest to the processor has the highest priority. The second device
along the chain has second highest priority, and so on.
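The propagation of INTA along the chain can be sketched as a simple scan from the device nearest the processor. The function name and the list representation of pending requests are assumptions for illustration.

```python
def daisy_chain_acknowledge(pending):
    """Return the position of the device that captures INTA, or None.

    `pending[k]` is True if the device at position k has a pending request;
    position 0 is the device electrically closest to the processor.
    """
    if not any(pending):
        return None   # INTR is inactive, so the processor never asserts INTA
    for position, has_request in enumerate(pending):
        if has_request:
            # This device blocks INTA and puts its identifying code on the bus.
            return position
        # Otherwise the device passes INTA on to the next one in the chain.
    return None


winner = daisy_chain_acknowledge([False, True, True])
```

The second and third devices both request service, but the acknowledge signal never reaches the third, so position in the chain fixes the priority.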
The scheme in figure 3.a requires considerably fewer wires than the individual
connections in figure 2. The main advantage of the scheme in figure 2 is that it allows the
processor to accept interrupt requests from some devices but not from others, depending
upon their priorities. The two schemes may be combined to produce the more general
structure in figure 3b. Devices are organized in groups, and each group is connected at a
different priority level. Within a group, devices are connected in a daisy chain. This
organization is used in many computer systems.
Controlling Device Requests: Until now, we have assumed that an I/O device interface generates an interrupt
request whenever it is ready for an I/O transfer, for example whenever the SIN flag is 1.
It is important to ensure that interrupt requests are generated only by those I/O devices
that are being used by a given program. Idle devices must not be allowed to generate
interrupt requests, even though they may be ready to participate in I/O transfer
operations. Hence, we need a mechanism in the interface circuits of individual devices to
control whether a device is allowed to generate an interrupt request.
The control needed is usually provided in the form of an interrupt-enable bit in
the device's interface circuit. The keyboard interrupt-enable, KEN, and display interrupt-enable,
DEN, flags in register CONTROL perform this function. If either of these flags
is set, the interface circuit generates an interrupt request whenever the corresponding
status flag in register STATUS is set. At the same time, the interface circuit sets bit KIRQ
or DIRQ to indicate that the keyboard or display unit, respectively, is requesting an
interrupt. If an interrupt-enable bit is equal to 0, the interface circuit will not generate an
interrupt request, regardless of the state of the status flag.
To summarize, there are two independent mechanisms for controlling interrupt
requests. At the device end, an interrupt-enable bit in a control register determines
whether the device is allowed to generate an interrupt request. At the processor end,
either an interrupt enable bit in the PS register or a priority structure determines whether
a given interrupt request will be accepted.
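The two mechanisms can be sketched as two independent gates. The bit positions below are assumptions, as is the name SOUT for the display-ready flag; only KEN, DEN, SIN, KIRQ, and DIRQ are named in the text.

```python
KEN, DEN = 0x1, 0x2    # hypothetical bit positions in register CONTROL
SIN, SOUT = 0x1, 0x2   # hypothetical bit positions of the ready flags in STATUS


def device_requests(control, status):
    """Device end: return (KIRQ, DIRQ) as generated by the interface circuit."""
    kirq = bool(control & KEN) and bool(status & SIN)
    dirq = bool(control & DEN) and bool(status & SOUT)
    return kirq, dirq


def processor_accepts(control, status, interrupt_enable):
    """Processor end: a request is accepted only if interrupts are enabled in the PS."""
    kirq, dirq = device_requests(control, status)
    return bool(interrupt_enable and (kirq or dirq))
```

A ready device with its enable bit cleared raises no request, and a raised request is still not accepted while the processor has interrupts disabled.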
Exceptions: An interrupt is an event that causes the execution of one program to be suspended
and the execution of another program to begin. So far, we have dealt only with interrupts
caused by requests received during I/O data transfers. However, the interrupt mechanism
is used in a number of other situations.
The term exception is often used to refer to any event that causes an interruption.
Hence, I/O interrupts are one example of an exception. We now describe a few other
kinds of exceptions.
Recovery from Errors: Computers use a variety of techniques to ensure that all hardware components are
operating properly. For example, many computers include an error-checking code in the
main memory, which allows detection of errors in the stored data. If an error occurs, the
control hardware detects it and informs the processor by raising an interrupt.
The processor may also interrupt a program if it detects an error or an unusual
condition while executing the instructions of this program. For example, the OP-code
field of an instruction may not correspond to any legal instruction, or an arithmetic
instruction may attempt a division by zero.
When exception processing is initiated as a result of such errors, the processor
proceeds in exactly the same manner as in the case of an I/O interrupt request. It suspends
the program being executed and starts an exception-service routine. This routine takes
appropriate action to recover from the error, if possible, or to inform the user about it.
Recall that in the case of an I/O interrupt, the processor completes execution of the
instruction in progress before accepting the interrupt. However, when an interrupt is
caused by an error, execution of the interrupted instruction cannot usually be completed,
and the processor begins exception processing immediately.
Debugging: Another important type of exception is used as an aid in debugging programs.
System software usually includes a program called a debugger, which helps the
programmer find errors in a program. The debugger uses exceptions to provide two
important facilities called trace and breakpoints.
When a processor is operating in the trace mode, an exception occurs after
execution of every instruction, using the debugging program as the exception-service
routine. The debugging program enables the user to examine the contents of registers,
memory locations, and so on. On return from the debugging program, the next instruction
in the program being debugged is executed, then the debugging program is activated
again. The trace exception is disabled during the execution of the debugging program.
Breakpoints provide a similar facility, except that the program being debugged is
interrupted only at specific points selected by the user. An instruction called Trap or
Software-interrupt is usually provided for this purpose. Execution of this instruction
results in exactly the same actions as when a hardware interrupt request is received.
While debugging a program, the user may wish to interrupt program execution after
instruction i. The debugging routine saves instruction i+1 and replaces it with a software
interrupt instruction. When the program is executed and reaches that point, it is
interrupted and the debugging routine is activated. This gives the user a chance to
examine memory and register contents. When the user is ready to continue executing the
program being debugged, the debugging routine restores the saved instruction that was at
location i+1 and executes a Return-from-interrupt instruction.
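The breakpoint mechanism can be sketched with a program held as a list of instruction names. The string "TRAP" stands in for the Trap or Software-interrupt instruction, and the Debugger class and run function are hypothetical helpers written for this example.

```python
TRAP = "TRAP"   # stands in for the Trap / Software-interrupt instruction


class Debugger:
    def __init__(self, program):
        self.program = program
        self.saved = {}

    def set_breakpoint(self, address):
        # Save the instruction at location i+1 and replace it with a trap.
        self.saved[address] = self.program[address]
        self.program[address] = TRAP

    def resume(self, address):
        # Restore the saved instruction before returning to the program.
        self.program[address] = self.saved.pop(address)


def run(program, start=0):
    """Execute until a trap or the end; return (executed instructions, stop address)."""
    executed, pc = [], start
    while pc < len(program):
        if program[pc] == TRAP:
            return executed, pc   # control passes to the debugging routine
        executed.append(program[pc])
        pc += 1
    return executed, None


program = ["load", "add", "store", "halt"]
debugger = Debugger(program)
debugger.set_breakpoint(2)          # interrupt execution after instruction 1
before, stop = run(program)         # runs "load" and "add", then traps
debugger.resume(stop)
after, end = run(program, stop)     # continues with the restored instruction
```

The program sees its original instruction at location 2 when it resumes, so the breakpoint leaves no trace in the program's behavior.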
Privilege Exception: To protect the operating system of a computer from being corrupted by user
programs, certain instructions can be executed only while the processor is in supervisor
mode. These are called privileged instructions. For example, when the processor is
running in the user mode, it will not execute an instruction that changes the priority level
of the processor or that enables a user program to access areas in the computer memory
that have been allocated to other users. An attempt to execute such an instruction will
produce a privilege exception, causing the processor to switch to the supervisor mode
and begin executing an appropriate routine in the operating system.
DIRECT MEMORY ACCESS: The discussion in the previous sections concentrates on data transfer between the
processor and I/O devices. Data are transferred by executing instructions such as
Move DATAIN, R0
An instruction to transfer input or output data is executed only after the processor
determines that the I/O device is ready. To do this, the processor either polls a status flag
in the device interface or waits for the device to send an interrupt request. In either case,
considerable overhead is incurred, because several program instructions must be executed
for each data word transferred. In addition to polling the status register of the device,
instructions are needed for incrementing the memory address and keeping track of the
word count. When interrupts are used, there is the additional overhead associated with
saving and restoring the program counter and other state information.
To transfer large blocks of data at high speed, an alternative approach is used. A
special control unit may be provided to allow transfer of a block of data directly between
an external device and the main memory, without continuous intervention by the
processor. This approach is called direct memory access, or DMA.
DMA transfers are performed by a control circuit that is part of the I/O device
interface. We refer to this circuit as a DMA controller. The DMA controller performs the
functions that would normally be carried out by the processor when accessing the main
memory. For each word transferred, it provides the memory address and all the bus
signals that control data transfer. Since it has to transfer blocks of data, the DMA
controller must increment the memory address for successive words and keep track of the
number of transfers.
Although a DMA controller can transfer data without intervention by the
processor, its operation must be under the control of a program executed by the
processor. To initiate the transfer of a block of words, the processor sends the starting
address, the number of words in the block, and the direction of the transfer. On receiving
this information, the DMA controller proceeds to perform the requested operation. When
the entire block has been transferred, the controller informs the processor by raising an
interrupt signal.
While a DMA transfer is taking place, the program that requested the transfer
cannot continue, and the processor can be used to execute another program. After the
DMA transfer is completed, the processor can return to the program that requested the
transfer.
I/O operations are always performed by the operating system of the computer in
response to a request from an application program. The OS is also responsible for
suspending the execution of one program and starting another. Thus, for an I/O operation
involving DMA, the OS puts the program that requested the transfer in the Blocked state,
initiates the DMA operation, and starts the execution of another program. When the
transfer is completed, the DMA controller informs the processor by sending an interrupt
request. In response, the OS puts the suspended program in the Runnable state so that it
can be selected by the scheduler to continue execution.
Figure 4 shows an example of the DMA controller registers that are accessed by
the processor to initiate transfer operations.

Figure 4 Registers in DMA interface (status and control register with the IRQ, IE, R/W, and Done flags; starting address register; word count register)

Figure 5 Use of DMA controllers in a computer system (components shown: main memory, processor, system bus, disk/DMA controller with two disks, DMA controller, printer, keyboard, network interface)

Two registers are used for storing the starting address and the word count. The third
register contains status and control flags.
The R/W bit determines the direction of the transfer. When this bit is set to 1 by a
program instruction, the controller performs a read operation, that is, it transfers data
from the memory to the I/O device. Otherwise, it performs a write operation. When the
controller has completed transferring a block of data and is ready to receive another
command, it sets the Done flag to 1. Bit 30 is the Interrupt-enable flag, IE. When this flag
is set to 1, it causes the controller to raise an interrupt after it has completed transferring a
block of data. Finally, the controller sets the IRQ bit to 1 when it has requested an
interrupt.
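The behavior of these registers can be sketched with a small model. The text places IE in bit 30 and the figure labels suggest IRQ in bit 31; the positions chosen below for R/W and Done are assumptions, as are the class name DMAController and the use of Python lists for the memory and the device.

```python
IRQ = 1 << 31    # interrupt requested (position suggested by the figure labels)
IE = 1 << 30     # interrupt-enable flag (bit 30, as stated in the text)
RW = 1 << 1      # hypothetical position: 1 means read (memory to device)
DONE = 1 << 0    # hypothetical position: block transfer completed


class DMAController:
    def __init__(self, memory):
        self.memory = memory
        self.start_address = 0
        self.word_count = 0
        self.status = DONE    # idle and ready to receive a command

    def start(self, start_address, word_count, read, interrupt_enable):
        # Processor side: load the registers; Done and IRQ are cleared.
        self.start_address = start_address
        self.word_count = word_count
        self.status = (RW if read else 0) | (IE if interrupt_enable else 0)

    def run(self, device):
        # Controller side: transfer the block without the processor's help.
        address, remaining = self.start_address, self.word_count
        while remaining > 0:
            if self.status & RW:
                device.append(self.memory[address])    # read: memory to device
            else:
                self.memory[address] = device.pop(0)   # write: device to memory
            address += 1        # next memory address
            remaining -= 1      # keep track of the word count
        self.status |= DONE
        if self.status & IE:
            self.status |= IRQ  # raise an interrupt and record the request


memory = list(range(10, 20))
disk = []
dma = DMAController(memory)
dma.start(start_address=2, word_count=3, read=True, interrupt_enable=True)
dma.run(disk)
```

The processor's only involvement is the call to start; the address incrementing and word counting that a program would otherwise perform happen inside the controller.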
An example of a computer system is given in figure 5, showing how DMA
controllers may be used. A DMA controller connects a high-speed network to the
computer bus. The disk controller, which controls two disks, also has DMA capability
and provides two DMA channels. It can perform two independent DMA operations, as if
each disk had its own DMA controller. The registers needed to store the memory address,
the word count, and so on are duplicated, so that one set can be used with each device.

To start a DMA transfer of a block of data from the main memory to one of the
disks, a program writes the address and word count information into the registers of the
corresponding channel of the disk controller. It also provides the disk controller with
information to identify the data for future retrieval. The DMA controller proceeds
independently to implement the specified operation. When the DMA transfer is
completed, this fact is recorded in the status and control register of the DMA channel by
setting the Done bit. At the same time, if the IE bit is set, the controller sends an interrupt
request to the processor and sets the IRQ bit. The status register can also be used to
record other information, such as whether the transfer took place correctly or errors
occurred.
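
The register-level behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the controller's actual design: the class name DMAChannel and its methods are hypothetical, memory is modeled as a list of words, and only the position of IE (bit 30) comes from the text. The positions chosen for IRQ, Done, and R/W are assumptions.

```python
from dataclasses import dataclass

# Bit positions in the status and control register. IE = bit 30 is stated in
# the text; the other three positions are assumptions made for this sketch.
IRQ = 1 << 31
IE = 1 << 30
DONE = 1 << 1
RW = 1 << 0


@dataclass
class DMAChannel:
    start_address: int = 0
    word_count: int = 0
    status: int = 0

    def program(self, address, count, read, interrupt):
        """What the program does: load the registers and the control flags."""
        self.start_address = address
        self.word_count = count
        self.status = (RW if read else 0) | (IE if interrupt else 0)

    def run(self, memory, device):
        """What the controller does on its own once it has been programmed."""
        address = self.start_address
        for _ in range(self.word_count):
            if self.status & RW:
                device.append(memory[address])    # read: memory -> I/O device
            else:
                memory[address] = device.pop(0)   # write: I/O device -> memory
            address += 1
        self.status |= DONE                       # block transfer finished
        if self.status & IE:
            self.status |= IRQ                    # interrupt has been requested


memory = list(range(100, 116))   # 16 words of pretend main memory
disk = []                        # words received by the device
channel = DMAChannel()
channel.program(address=4, count=3, read=True, interrupt=True)
channel.run(memory, disk)
print(disk)                                                      # [104, 105, 106]
print(bool(channel.status & DONE), bool(channel.status & IRQ))   # True True
```

Note that the processor appears only in program(); the loop in run() stands for the work the controller carries out without processor involvement.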
Memory accesses by the processor and the DMA controller are interwoven.
Requests by DMA devices for using the bus are always given higher priority than
processor requests. Among different DMA devices, top priority is given to high-speed
peripherals such as a disk, a high-speed network interface, or a graphics display device.
Since the processor originates most memory access cycles, the DMA controller can be
said to steal memory cycles from the processor. Hence, the interweaving technique is
usually called cycle stealing. Alternatively, the DMA controller may be given exclusive
access to the main memory to transfer a block of data without interruption. This is known
as block or burst mode.
Most DMA controllers incorporate a data storage buffer. In the case of the
network interface in figure 5, for example, the DMA controller reads a block of data from
the main memory and stores it into its input buffer. This transfer takes place using burst
mode at a speed appropriate to the memory and the computer bus. Then, the data in the
buffer are transmitted over the network at the speed of the network.
A conflict may arise if both the processor and a DMA controller or two DMA
controllers try to use the bus at the same time to access the main memory. To resolve
these conflicts, an arbitration procedure is implemented on the bus to coordinate the
activities of all devices requesting memory transfers.
Bus Arbitration:-
The device that is allowed to initiate data transfers on the bus at any given time is
called the bus master. When the current master relinquishes control of the bus, another
device can acquire this status. Bus arbitration is the process by which the next device to
become the bus master is selected and bus mastership is transferred to it. The selection of
the bus master must take into account the needs of various devices by establishing a
priority system for gaining access to the bus.
There are two approaches to bus arbitration: centralized and distributed. In
centralized arbitration, a single bus arbiter performs the required arbitration. In
distributed arbitration, all devices participate in the selection of the next bus master.

Centralized Arbitration:-
The bus arbiter may be the processor or a separate unit connected to the bus. Consider a
basic arrangement in which the processor contains the bus arbitration circuitry. In this
case, the processor is normally the bus master unless it grants bus mastership to one of
the DMA controllers. A DMA controller indicates that it needs to become the bus master
by activating the Bus-Request line, BR. The signal on the Bus-Request line is the logical
OR of the bus requests from all the devices connected to it. When Bus-Request is
activated, the processor activates the Bus-Grant signal, BG1, indicating to the DMA
controllers that they may use the bus when it becomes free. This signal is connected to all
DMA controllers using a daisy-chain arrangement. Thus, if DMA controller 1 is
requesting the bus, it blocks the propagation of the grant signal to other devices.
Otherwise, it passes the grant downstream by asserting BG2. The current bus master
indicates to all devices that it is using the bus by activating another open-collector line
called Bus-Busy, BBSY. Hence, after receiving the Bus-Grant signal, a DMA controller
waits for Bus-Busy to become inactive, then assumes mastership of the bus. At this time,
it activates Bus-Busy to prevent other devices from using the bus at the same time.
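
The daisy-chain rule can be illustrated with a short Python sketch. The function name daisy_chain_grant is hypothetical, and the model covers only grant propagation; the Bus-Busy wait is left out. The controller nearest the processor sees the grant first, so position in the chain fixes the priority.

```python
def daisy_chain_grant(requests):
    """Return the index of the controller that keeps the grant, or None.

    requests[i] is True when DMA controller i + 1 is asserting Bus-Request.
    Controller 1 (index 0) is the first one the grant signal reaches.
    """
    bus_request = any(requests)   # BR is the logical OR of all requests
    grant = bus_request           # the processor asserts BG1 when BR is active
    for index, requesting in enumerate(requests):
        if grant and requesting:
            return index          # this controller blocks the grant here
        # A controller that is not requesting passes the grant on (BG2, BG3, ...).
    return None


print(daisy_chain_grant([False, True, True]))     # 1
print(daisy_chain_grant([False, False, False]))   # None
```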
Distributed Arbitration:-

Fig 6 A distributed arbitration scheme (interface circuit for device A driving the open-collector (O.C.) lines ARB3 through ARB0 and Start-Arbitration, which are pulled up to Vcc)


Distributed arbitration means that all devices waiting to use the bus have equal
responsibility in carrying out the arbitration process, without using a central arbiter. A
simple method for distributed arbitration is illustrated in figure 6. Each device on the bus
is assigned a 4-bit identification number. When one or more devices request the bus, they
assert the Start-Arbitration signal and place their 4-bit ID numbers on four open-collector
lines, ARB0 through ARB3. A winner is selected as a result of the interaction
among the signals transmitted over those lines by all contenders. The net outcome is that
the code on the four lines represents the request that has the highest ID number.
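
The text does not spell out how the contenders interact, so the following Python sketch assumes one common mechanism. The open-collector lines behave as a wired OR. Each device compares the lines with its own ID from the most significant bit downward. When it sees a 1 on a line where its own bit is 0, it stops driving that bit and all lower-order bits. The function name arbitrate is hypothetical.

```python
def arbitrate(ids, width=4):
    """Return the pattern left on ARB3..ARB0 once the contenders settle."""
    driven = list(ids)                  # what each device currently drives
    while True:
        lines = 0
        for pattern in driven:
            lines |= pattern            # open-collector lines form a wired OR
        updated = []
        for device_id in ids:
            keep = device_id
            for bit in range(width - 1, -1, -1):
                mask = 1 << bit
                if (lines & mask) and not (device_id & mask):
                    # Outbid at this bit: release it and all lower-order bits.
                    keep = device_id & ~((mask << 1) - 1)
                    break
            updated.append(keep)
        if updated == driven:
            return lines                # stable: the lines carry the winner's ID
        driven = updated


print(arbitrate([5, 6]))   # 6
```

With devices 5 (0101) and 6 (0110) contending, the lines first show 0111. Device 5 is outbid at ARB1 and releases ARB1 and ARB0, after which the lines settle at 0110, the higher ID.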
Buses:-
The processor, main memory, and I/O devices can be interconnected by means of
a common bus whose primary function is to provide a communication path for the
transfer of data. The bus includes the lines needed to support interrupts and arbitration. In
this section, we discuss the main features of the bus protocols used for transferring data.
A bus protocol is the set of rules that govern the behavior of various devices connected to
the bus as to when to place information on the bus, assert control signals, and so on. After
describing bus protocols, we will present examples of interface circuits that use these
protocols.

Synchronous Bus:-
In a synchronous bus, all devices derive timing information from a common
clock line. Equally spaced pulses on this line define equal time intervals. In the simplest
form of a synchronous bus, each of these intervals constitutes a bus cycle during which
one data transfer can take place. Such a scheme is illustrated in figure 7. The address and
data lines in this and subsequent figures are shown as high and low at the same time. This
is a common convention indicating that some lines are high and some low, depending on
the particular address or data pattern being transmitted. The crossing points indicate the
times at which these patterns change. A signal line in an indeterminate or high impedance
state is represented by an intermediate level half-way between the low and high signal
levels.
Let us consider the sequence of events during an input (read) operation. At time
t0, the master places the device address on the address lines and sends an appropriate
command on the control lines. In this case, the command will indicate an input operation
and specify the length of the operand to be read, if necessary. Information travels over the
bus at a speed determined by its physical and electrical characteristics. The clock pulse
width, t1 - t0, must be longer than the maximum propagation delay between two devices
connected to the bus. It also has to be long enough to allow all devices to decode the
address and control signals so that the addressed device (the slave) can respond at time t1.
It is important that slaves take no action and place no data on the bus before t1. The
information on the bus is unreliable during the period t0 to t1 because signals are changing
state. The addressed slave places the requested input data on the data lines at time t1.
At the end of the clock cycle, at time t2, the master strobes the data on the data
lines into its input buffer. In this context, strobe means to capture the values of the
data at a given instant and store them into a buffer. For data to be loaded correctly into
any storage device, such as a register built with flip-flops, the data must be available at
the input of that device for a period greater than the setup time of the device. Hence, the
period t2 - t1 must be greater than the maximum propagation time on the bus plus the
setup time of the input buffer register of the master.

Figure 7 Timing of an input transfer on a synchronous bus (signals: bus clock, address and command, data; time points t0, t1, t2 within one bus cycle)
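
The two timing constraints can be turned into a small calculation. The Python sketch below is illustrative only: the function name and the delay values are made up. Adding the decode time to the propagation delay for the first phase is a conservative assumption, since the text only requires the pulse to be long enough for both. The second bound follows the text directly.

```python
def sync_bus_minimums(propagation_ns, decode_ns, setup_ns):
    """Return lower bounds (t1 - t0, t2 - t1) in nanoseconds."""
    # t1 - t0: address and command must reach every device and be decoded.
    address_phase = propagation_ns + decode_ns
    # t2 - t1: data must cross the bus and satisfy the master's setup time.
    data_phase = propagation_ns + setup_ns
    return address_phase, data_phase


address_phase, data_phase = sync_bus_minimums(propagation_ns=5, decode_ns=3, setup_ns=2)
print(address_phase, data_phase)   # 8 7
```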
A similar procedure is followed for an output operation. The master places the
output data on the data lines when it transmits the address and command information. At
time t2, the addressed device strobes the data lines and loads the data into its data buffer.
The timing diagram in figure 7 is an idealized representation of the actions that
take place on the bus lines. The exact times at which signals actually change state are
somewhat different from those shown because of propagation delays on bus wires and in
the circuits of the devices. Figure 8 gives a more realistic picture of what happens in
practice. It shows two views of each signal, except the clock. Because signals take time to
travel from one device to another, a given signal transition is seen by different devices at
different times. One view shows the signal as seen by the master and the other as seen by
the slave.
The master sends the address and command signals on the rising edge at the
beginning of clock period 1 (t0). However, these signals do not actually appear on the bus
until tAM, largely due to the delay in the bus driver circuit. A while later, at tAS, the
signals reach the slave. The slave decodes the address and at t1 sends the requested data.
Here again, the data signals do not appear on the bus until tDS. They travel toward the
master and arrive at tDM. At t2, the master loads the data into its input buffer. Hence the
period t2 - tDM is the setup time for the master's input buffer. The data must continue to be
valid after t2 for a period equal to the hold time of that buffer.

Figure 8 A detailed timing diagram for the input transfer of figure 7 (bus clock; address and command, and data, each as seen by the master and as seen by the slave; time points t0, tAM, tAS, t1, tDS, tDM, t2)

Multiple-Cycle transfers:-
The scheme described above results in a simple design for the device interface.
However, it has some limitations. Because a transfer has to be completed within one clock
cycle, the clock period, t2 - t0, must be chosen to accommodate the longest delays on the
bus and the slowest device interface. This forces all devices to operate at the speed of the
slowest device.
Also, the processor has no way of determining whether the addressed device has
actually responded. It simply assumes that, at t2, the output data have been received by
the I/O device or the input data are available on the data lines. If, because of a
malfunction, the device does not respond, the error will not be detected.
To overcome these limitations, most buses incorporate control signals that
represent a response from the device. These signals inform the master that the slave has
recognized its address and that it is ready to participate in a data-transfer operation. They
also make it possible to adjust the duration of the data-transfer period to suit the needs of
the participating devices. To simplify this process, a high-frequency clock signal is used
such that a complete data transfer cycle would span several clock cycles. Then, the
number of clock cycles involved can vary from one device to another.
An example of this approach is shown in figure 9. During clock cycle 1, the
master sends address and command information on the bus, requesting a read operation.
The slave receives this information and decodes it. On the following active edge of the
clock, that is, at the beginning of clock cycle 2, it makes a decision to respond and begins
to access the requested data. We have assumed that some delay is involved in getting the
data, and hence the slave cannot respond immediately. The data become ready and are
placed on the bus in clock cycle 3. At the same time, the slave asserts a control signal
called Slave-ready.
The Slave-ready signal is an acknowledgment from the slave to the master,
confirming that valid data have been sent. In the example in figure 9, the slave responds
in cycle 3. Another device may respond sooner or later. The Slave-ready signal allows the
duration of a bus transfer to change from one device to another. If the addressed device
does not respond at all, the master waits for some predefined maximum number of clock
cycles, then aborts the operation. This could be the result of an incorrect address or a
device malfunction.
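
The master's side of such a transfer can be sketched as a loop that samples Slave-ready once per clock cycle and gives up after a fixed number of cycles. In this Python sketch the function name, the limit of 8 cycles, and the data value 0xAB are placeholders chosen for illustration.

```python
def read_transfer(slave_latency, max_wait=8):
    """Return (cycle, data) when Slave-ready is seen, or (None, None) on timeout.

    slave_latency is the clock cycle in which the slave places data on the bus
    and asserts Slave-ready; None models a device that never responds.
    """
    for cycle in range(1, max_wait + 1):
        # Cycle 1 carries the address and command; later cycles are waits.
        slave_ready = slave_latency is not None and cycle >= slave_latency
        if slave_ready:
            return cycle, 0xAB   # placeholder data value
    return None, None            # the master aborts the operation


print(read_transfer(3))      # (3, 171)
print(read_transfer(None))   # (None, None)
```

A fast device returns sooner and a slow one later, while a wrong address or a failed device ends in the timeout branch instead of going undetected.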

Figure 9 An input transfer using multiple clock cycles (signals: clock, address, command, data, Slave-ready)

ASYNCHRONOUS BUS:-
An alternative scheme for controlling data transfers on the bus is based on the use
of a handshake between the master and the slave. The concept of a handshake is a
generalization of the idea of the Slave-ready signal in figure 9. The common clock is
replaced by two timing control lines, Master-ready and Slave-ready. The first is asserted
by the master to indicate that it is ready for a transaction, and the second is a response
from the slave.
In principle, a data transfer controlled by a handshake protocol proceeds as
follows. The master places the address and command information on the bus. Then it
indicates to all devices that it has done so by activating the Master-ready line. This causes
all devices on the bus to decode the address. The selected slave performs the required
operation and informs the processor it has done so by activating the Slave-ready line. The
master waits for Slave-ready to become asserted before it removes its signals from the
bus. In the case of a read operation, it also strobes the data into its input buffer.

Figure 10 Handshake control of data transfer during an input operation (signals: address and command, Master-ready, Slave-ready, data; time points t0 through t5 within one bus cycle)

An example of the timing of an input data transfer using the handshake scheme is
given in figure 10, which depicts the following sequence of events.
t0 The master places the address and command information on the bus, and all devices
on the bus begin to decode this information.
t1 The master sets the Master-ready line to 1 to inform the I/O devices that the address
and command information is ready. The delay t1 - t0 is intended to allow for any skew that
may occur on the bus. Skew occurs when two signals simultaneously transmitted from one
source arrive at the destination at different times. This happens because different lines of
the bus may have different propagation speeds. Thus, to guarantee that the Master-ready
signal does not arrive at any device ahead of the address and command information, the
delay t1 - t0 should be larger than the maximum possible bus skew.
t2 The selected slave, having decoded the address and command information, performs
the required input operation by placing the data from its data register on the data lines.
At the same time, it sets the Slave-ready signal to 1.
t3 The Slave-ready signal arrives at the master, indicating that the input data are
available on the bus. The master strobes the data into its input buffer and drops the
Master-ready signal to indicate that it has received the data.
t4 The master removes the address and command information from the bus. The delay
between t3 and t4 is again intended to allow for bus skew.
t5 When the device interface receives the 1 to 0 transition of the Master-ready signal, it
removes the data and the Slave-ready signal from the bus. This completes the input
transfer.
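
The six steps can be replayed as a trace of bus states. The Python sketch below is a simplified model with hypothetical names. It assumes that the slave raises Slave-ready together with the data at t2 and that the master drops Master-ready once it has strobed the data at t3, which is what steps t3 and t5 imply. Skew delays are not modeled.

```python
def handshake_read(device_data):
    """Return (latched data, list of (step, bus state)) for one input transfer."""
    bus = {"address": False, "master_ready": 0, "slave_ready": 0, "data": None}
    trace = []

    def snapshot(label):
        trace.append((label, dict(bus)))

    bus["address"] = True        # t0: address and command placed on the bus
    snapshot("t0")
    bus["master_ready"] = 1      # t1: raised after waiting out the bus skew
    snapshot("t1")
    bus["data"] = device_data    # t2: slave drives the data ...
    bus["slave_ready"] = 1       #     ... and raises Slave-ready
    snapshot("t2")
    latched = bus["data"]        # t3: master strobes the data ...
    bus["master_ready"] = 0      #     ... and drops Master-ready
    snapshot("t3")
    bus["address"] = False       # t4: address and command removed
    snapshot("t4")
    bus["data"] = None           # t5: slave removes the data ...
    bus["slave_ready"] = 0       #     ... and Slave-ready
    snapshot("t5")
    return latched, trace


latched, trace = handshake_read(0x41)
for label, state in trace:
    print(label, state)
```

Each signal change is caused by the previous one rather than by a clock edge, which is why the duration of the transfer adapts to the speed of the slave.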

PARALLEL PORT:-
Figure 11 shows the hardware components needed for connecting a keyboard to a processor. A
typical keyboard consists of mechanical switches that are normally open. When a key is
pressed, its switch closes and establishes a path for an electrical signal. This signal is
detected by an encoder circuit that generates the ASCII code for the corresponding
character.
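Figure 11 Keyboard to processor connection (processor connected over the data, address, R/W, Master-ready, and Slave-ready lines to an input interface holding DATAIN and SIN; the interface receives data and Valid from the encoder and debouncing circuit, which is fed by the keyboard switches)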

The output of the encoder consists of the bits that represent the encoded character
and one control signal called Valid, which indicates that a key is being pressed. This
information is sent to the interface circuit, which contains a data register, DATAIN, and a
status flag, SIN. When a key is pressed, the Valid signal changes from 0 to 1, causing the
ASCII code to be loaded into DATAIN and SIN to be set to 1. The status flag SIN is
cleared to 0 when the processor reads the contents of the DATAIN register. The interface
circuit is connected to an asynchronous bus on which transfers are controlled using the
handshake signals Master-ready and Slave-ready, as indicated in figure 11. The third
control line, R/W, distinguishes read and write transfers.
Figure 12 shows a suitable circuit for an input interface. The output lines of the
DATAIN register are connected to the data lines of the bus by means of three-state
drivers, which are turned on when the processor issues a read instruction with the address
that selects this register. The SIN signal is generated by a status flag circuit. This signal is
also sent to the bus through a three-state driver. It is connected to bit D0, which means it
will appear as bit 0 of the status register. Other bits of this register do not contain valid
information. An address decoder is used to select the input interface when the high-order
31 bits of an address correspond to any of the addresses assigned to this interface.
Address bit A0 determines whether the status or the data register is to be read when the
Master-ready signal is active. The control handshake is accomplished by activating the
Slave-ready signal when either Read-status or Read-data is equal to 1.
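
Seen from the program, the input interface behaves like two readable locations. The following Python sketch models that behavior under stated assumptions. The class name, the address 0x4000, and the choice that A0 = 1 selects the status register are illustrative and are not taken from the circuit.

```python
class KeyboardInterface:
    """Behavioral model of DATAIN, SIN, and the A0-based register select."""

    def __init__(self, base_address):
        self.base = base_address & ~1   # high-order bits select the interface
        self.datain = 0
        self.sin = 0

    def key_pressed(self, ascii_code):
        """Valid goes from 0 to 1: load DATAIN and set SIN."""
        self.datain = ascii_code
        self.sin = 1

    def read(self, address):
        """Bus read; returns None when the address belongs to another device."""
        if (address & ~1) != self.base:
            return None                 # address decoder does not match
        if address & 1:                 # A0 = 1: Read-status (assumed polarity)
            return self.sin             # SIN appears as bit D0
        value = self.datain             # A0 = 0: Read-data
        self.sin = 0                    # reading DATAIN clears SIN
        return value


keyboard = KeyboardInterface(0x4000)
keyboard.key_pressed(ord("A"))
while not keyboard.read(0x4001):        # poll the status register until SIN = 1
    pass
char = keyboard.read(0x4000)            # read DATAIN, which also clears SIN
print(chr(char), keyboard.read(0x4001))   # A 0
```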

Fig 12 Input interface circuit

Fig 13 Printer to processor connection (processor connected over the data, address, R/W, Master-ready, and Slave-ready lines to an output interface holding DATAOUT and SOUT; the interface exchanges data, Valid, and Idle with the printer)


Let us now consider an output interface that can be used to connect an output
device, such as a printer, to a processor, as shown in figure 13. The printer operates under
control of the handshake signals Valid and Idle in a manner similar to the handshake used
on the bus with the Master-ready and Slave-ready signals. When it is ready to accept a
character, the printer asserts its Idle signal. The interface circuit can then place a new
character on the data lines and activate the Valid signal. In response, the printer starts
printing the new character and negates the Idle signal, which in turn causes the interface
to deactivate the Valid signal.

The circuit in figure 15 has separate input and output data lines for connection to
an I/O device. A more flexible parallel port is created if the data lines to I/O devices are
bidirectional. Figure 16 shows a general-purpose parallel interface circuit that can be
configured in a variety of ways. Data lines P7 through P0 can be used for either input or
output purposes. For increased flexibility, the circuit makes it possible for some lines to
serve as inputs and some lines to serve as outputs, under program control. The
DATAOUT register is connected to these lines via three-state drivers that are controlled
by a data direction register, DDR. The processor can write any 8-bit pattern into DDR.
For a given bit, if the DDR value is 1, the corresponding data line acts as an output line;
otherwise, it acts as an input line.
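
The effect of the data direction register is easy to express with bit masks. In the Python sketch below, the function name port_pins and the parameter external (the levels applied by the attached device) are hypothetical names used for illustration.

```python
def port_pins(ddr, dataout, external):
    """Return the 8-bit value seen on lines P7..P0.

    ddr      -- data direction register: 1 = output line, 0 = input line
    dataout  -- contents of the DATAOUT register
    external -- levels applied to the lines by the attached device
    """
    ddr &= 0xFF
    driven = dataout & ddr              # three-state drivers enabled by DDR
    received = external & ~ddr & 0xFF   # the remaining lines follow the device
    return driven | received


# P7..P4 configured as outputs, P3..P0 as inputs.
print(hex(port_pins(ddr=0xF0, dataout=0xA5, external=0x3C)))   # 0xac
```

Here the upper four lines carry the upper half of DATAOUT (0xA) and the lower four carry what the device applies (0xC).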

Fig 14 Output interface circuit

Fig 15 Combined input/output interface circuit

Fig 16 A general 8-bit parallel interface (bus data lines D7 through D0; port lines P7 through P0; DATAIN, DATAOUT, and data direction registers; status and control block with C1, C2, and INTR; register select inputs My-address, RS2 through RS0, R/W, Ready, and Accept)

Fig 17 A parallel port interface for the bus

Fig 18 State diagram for the timing logic

Figure 19 Timing for output interface (signals: clock, address, R/W, data, Go, Slave-ready)

SERIAL PORT:-
A serial port is used to connect the processor to I/O devices that require
transmission of data one bit at a time. The key feature of an interface circuit for a serial
port is that it is capable of communicating in a bit-serial fashion on the device side and in
a bit-parallel fashion on the bus side. The transformation between the parallel and serial
formats is achieved with shift registers that have parallel access capability. A block
diagram of a typical serial interface is shown in figure 20. It includes the familiar
DATAIN and DATAOUT registers. The input shift register accepts bit-serial input from
the I/O device. When all 8 bits of data have been received, the contents of this shift
register are loaded in parallel into the DATAIN register. Similarly, output data in the
DATAOUT register are loaded into the output shift register, from which the bits are shifted
out and sent to the I/O device.

Figure 20 A serial interface (input shift register fed by the serial input and loading DATAIN; DATAOUT loading the output shift register that drives the serial output; bus data lines D7 through D0; chip and register select with My-address, RS1, RS0, R/W, Ready, and Accept; status and control block with INTR; receiving clock and transmission clock)

The double buffering used in the input and output paths is important. A simpler
interface could be implemented by turning DATAIN and DATAOUT into shift registers
and eliminating the shift registers in figure 20. However, this would impose awkward
restrictions on the operation of the I/O device; after receiving one character from the
serial line, the device cannot start receiving the next character until the processor reads
the contents of DATAIN. Thus, a pause would be needed between two characters to
allow the processor to read the input data. With the double buffer, the transfer of the
second character can begin as soon as the first character is loaded from the shift register
into the DATAIN register. Thus, provided the processor reads the contents of DATAIN
before the serial transfer of the second character is completed, the interface can receive a
continuous stream of serial data. An analogous situation occurs in the output path of the
interface.
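
The benefit of the double buffer on the input path can be seen in a small Python model. The class name SerialReceiver, the overrun flag, and the choice of shifting the most significant bit in first are assumptions made for this sketch. Start and stop bits are not modeled.

```python
class SerialReceiver:
    """Input path of the serial interface: a shift register plus DATAIN."""

    def __init__(self):
        self.shift = 0
        self.count = 0
        self.datain = None
        self.overrun = False            # hypothetical flag, not from the text

    def clock_in(self, bit):
        """Shift in one received bit (the first bit ends up as the MSB)."""
        self.shift = ((self.shift << 1) | (bit & 1)) & 0xFF
        self.count += 1
        if self.count == 8:
            if self.datain is not None:
                self.overrun = True     # processor did not read DATAIN in time
            self.datain = self.shift    # parallel load into DATAIN
            self.shift = 0
            self.count = 0

    def read(self):
        """Processor reads DATAIN, freeing it for the next character."""
        value, self.datain = self.datain, None
        return value


def bits(byte):
    """Bits of a byte, most significant first."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]


rx = SerialReceiver()
for b in bits(0x48):
    rx.clock_in(b)
for b in bits(0x69)[:4]:                # the second character is already arriving
    rx.clock_in(b)
first = rx.read()                       # the processor reads in the meantime
for b in bits(0x69)[4:]:
    rx.clock_in(b)
second = rx.read()
print(hex(first), hex(second), rx.overrun)   # 0x48 0x69 False
```

The processor reads the first character while half of the second one has already been shifted in, and nothing is lost.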
Because serial interfaces play a vital role in connecting I/O devices,
several widely used standards have been developed. A standard circuit that includes the
features of our example in figure 20 is known as a Universal Asynchronous Receiver
Transmitter (UART). It is intended for use with low-speed serial devices. Data
transmission is performed using the asynchronous start-stop format. To facilitate
connection to communication links, a popular standard known as RS-232-C was
developed.
STANDARD I/O INTERFACES:-
The processor bus is the bus defined by the signals on the processor chip
itself. Devices that require a very high-speed connection to the processor, such as the
main memory, may be connected directly to this bus. For electrical reasons, only a few
devices can be connected in this manner. The motherboard usually provides another bus
that can support more devices. The two buses are interconnected by a circuit, which we
will call a bridge, that translates the signals and protocols of one bus into those of the
other. Devices connected to the expansion bus appear to the processor as if they were
connected directly to the processor's own bus. The only difference is that the bridge
circuit introduces a small delay in data transfers between the processor and those devices.
It is not possible to define a uniform standard for the processor bus. The structure
of this bus is closely tied to the architecture of the processor. It is also dependent on the
electrical characteristics of the processor chip, such as its clock speed. The expansion bus
is not subject to these limitations, and therefore it can use a standardized signaling
scheme. A number of standards have been developed. Some have evolved by default,
when a particular design became commercially successful. For example, IBM developed
a bus they called ISA (Industry Standard Architecture) for their personal computer known
at the time as PC AT.
Some standards have been developed through industrial cooperative efforts, even
among competing companies driven by their common self-interest in having compatible
products. In some cases, organizations such as the IEEE (Institute of Electrical and
Electronics Engineers), ANSI (American National Standards Institute), or international
bodies such as ISO (International Organization for Standardization) have blessed these standards
and given them an official status.
A given computer may use more than one bus standard. A typical Pentium
computer has both a PCI bus and an ISA bus, thus providing the user with a wide range
of devices to choose from.

Figure 21 An example of a computer system using different interface standards (blocks shown: main memory, processor, processor bus, bridge, PCI bus, additional memory, SCSI controller, Ethernet interface, SCSI bus, disk controller with disks 1 and 2, CD-ROM controller and CD-ROM, USB controller, ISA interface, IDE disk, video, keyboard, game)

Peripheral Component Interconnect (PCI) Bus:-
The PCI bus is a good example of a system bus that grew out of the need for
standardization. It supports the functions found on a processor bus but in a standardized
format that is independent of any particular processor. Devices connected to the PCI bus
appear to the processor as if they were connected directly to the processor bus. They are
assigned addresses in the memory address space of the processor.
The PCI follows a sequence of bus standards that were used primarily in IBM
PCs. Early PCs used the 8-bit XT bus, whose signals closely mimicked those of Intel's
80x86 processors. Later, the 16-bit bus used on the PC AT computers became known as
the ISA bus. Its extended 32-bit version is known as the EISA bus. Other buses
developed in the eighties with similar capabilities are the Microchannel used in IBM PCs
and the NuBus used in Macintosh computers.
The PCI was developed as a low-cost bus that is truly processor independent. Its
design anticipated a rapidly growing demand for bus bandwidth to support high-speed
disks and graphic and video devices, as well as the specialized needs of multiprocessor
systems. As a result, the PCI is still popular as an industry standard almost a decade after
it was first introduced in 1992.
An important feature that the PCI pioneered is a plug-and-play capability for
connecting I/O devices. To connect a new device, the user simply connects the device
interface board to the bus. The software takes care of the rest.
Data Transfer:-
In today's computers, most memory transfers involve a burst of data rather than
just one word. The reason is that modern processors include a cache memory. Data are
transferred between the cache and the main memory in bursts of several words each. The
words involved in such a transfer are stored at successive memory locations. When the
processor (actually the cache controller) specifies an address and requests a read
operation from the main memory, the memory responds by sending a sequence of data
words starting at that address. Similarly, during a write operation, the processor sends a
memory address followed by a sequence of data words, to be written in successive
memory locations starting at the address. The PCI is designed primarily to support this
mode of operation. A read or write operation involving a single word is simply treated as
a burst of length one.
The bus supports three independent address spaces: memory, I/O, and
configuration. The first two are self-explanatory. The I/O address space is intended for
use with processors, such as Pentium, that have a separate I/O address space. However, as
noted earlier, the system designer may choose to use memory-mapped I/O even when a separate
I/O address space is available. In fact, this is the approach recommended by the PCI for its
plug-and-play capability. A 4-bit command that accompanies the address identifies which
of the three spaces is being used in a given data transfer operation.
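
As an illustration of how a 4-bit command can select the address space, the Python sketch below lists six basic read and write commands. The encodings are not given in this chapter. They are quoted from the PCI Local Bus Specification as best recalled and should be checked against it before use. The names PCI_COMMANDS and decode_command are hypothetical.

```python
# Assumed encodings of the basic PCI bus commands (address space, direction).
PCI_COMMANDS = {
    0b0010: ("I/O", "read"),
    0b0011: ("I/O", "write"),
    0b0110: ("memory", "read"),
    0b0111: ("memory", "write"),
    0b1010: ("configuration", "read"),
    0b1011: ("configuration", "write"),
}


def decode_command(code):
    """Return (address space, direction), or None for commands not listed."""
    return PCI_COMMANDS.get(code & 0xF)


print(decode_command(0b0110))   # ('memory', 'read')
print(decode_command(0b1011))   # ('configuration', 'write')
```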
The signaling convention on the PCI bus is similar to the one used in figure 9. In that
figure, we assumed that the master maintains the address information on the bus until data
transfer is completed. But this is not necessary. The address is needed only long enough for the
slave to be selected. The slave can store the address in its internal buffer. Thus, the
address is needed on the bus for one clock cycle only, freeing the address lines to be used
for sending data in subsequent clock cycles. The result is a significant cost reduction
because the number of wires on a bus is an important cost factor. This approach is used
in the PCI bus.

Figure 22 Use of a PCI bus in a computer system (host and main memory connected through a PCI bridge to the PCI bus, which serves a disk, a printer, and an Ethernet interface)

At any given time, one device is the bus master. It has the right to initiate data
transfers by issuing read and write commands. A master is called an initiator in PCI
terminology. This is either a processor or a DMA controller. The addressed device that
responds to read and write commands is called a target.

Device Configuration:-
When an I/O device is connected to a computer, several actions are needed to
configure both the device and the software that communicates with it.
The PCI simplifies this process by incorporating in each I/O device interface a
small configuration ROM that stores information about that device. The
configuration ROMs of all devices are accessible in the configuration address space. The
PCI initialization software reads these ROMs whenever the system is powered up or
reset. In each case, it determines whether the device is a printer, a keyboard, an Ethernet
interface, or a disk controller. It can further learn about various device options and
characteristics.
Devices are assigned addresses during the initialization process. This means that
during the bus configuration operation, devices cannot be accessed based on their
address, as they have not yet been assigned one. Hence, the configuration address space
uses a different mechanism. Each device has an input signal called Initialization Device
Select, IDSEL#.

The PCI bus has gained great popularity in the PC world. It is also used in many
other computers, such as SUNs, to benefit from the wide range of I/O devices for which a
PCI interface is available. In the case of some processors, such as the Compaq Alpha, the
PCI-processor bridge circuit is built on the processor chip itself, further simplifying
system design and packaging.
SCSI Bus:-
The acronym SCSI stands for Small Computer System Interface. It refers to a
standard bus defined by the American National Standards Institute (ANSI) under the
designation X3.131. In the original specifications of the standard, devices such as disks
are connected to a computer via a 50-wire cable, which can be up to 25 meters in length
and can transfer data at rates up to 5 megabytes/s.
The SCSI bus standard has undergone many revisions, and its data transfer
capability has increased very rapidly, almost doubling every two years. SCSI-2 and
SCSI-3 have been defined, and each has several options. A SCSI bus may have eight data
lines, in which case it is called a narrow bus and transfers data one byte at a time.
Alternatively, a wide SCSI bus has 16 data lines and transfers data 16 bits at a time.
There are also several options for the electrical signaling scheme used.
Devices connected to the SCSI bus are not part of the address space of the
processor in the same way as devices connected to the processor bus. The SCSI bus is
connected to the processor bus through a SCSI controller. This controller uses DMA to
transfer data packets from the main memory to the device, or vice versa. A packet may
contain a block of data, commands from the processor to the device, or status information
about the device.
To illustrate the operation of the SCSI bus, let us consider how it may be used
with a disk drive. Communication with a disk drive differs substantially from
communication with the main memory.
A controller connected to a SCSI bus is one of two types: an initiator or a target.
An initiator has the ability to select a particular target and to send commands specifying
the operations to be performed. Clearly, the controller on the processor side, such as the
SCSI controller, must be able to operate as an initiator. The disk controller operates as a
target. It carries out the commands it receives from the initiator. The initiator establishes
a logical connection with the intended target. Once this connection has been established,
it can be suspended and restored as needed to transfer commands and bursts of data.
While a particular connection is suspended, other devices can use the bus to transfer
information. This ability to overlap data transfer requests is one of the key features of the
SCSI bus that leads to its high performance.
Data transfers on the SCSI bus are always controlled by the target controller. To
send a command to a target, an initiator requests control of the bus and, after winning
arbitration, selects the controller it wants to communicate with and hands control of the
bus over to it. Then the controller starts a data transfer operation to receive a command
from the initiator.
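Suppose the processor wishes to read a block of data that is stored in two disk sectors
that are not contiguous. The processor sends a command to the SCSI controller, which
causes the following sequence of events to take place: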
1. The SCSI controller, acting as an initiator, contends for control of the bus.
2. When the initiator wins the arbitration process, it selects the target controller and
hands over control of the bus to it.
3. The target starts an output operation (from initiator to target); in response to this,
the initiator sends a command specifying the required read operation.
4. The target, realizing that it first needs to perform a disk seek operation, sends a
message to the initiator indicating that it will temporarily suspend the connection
between them. Then it releases the bus.
5. The target controller sends a command to the disk drive to move the read head to
the first sector involved in the requested read operation. Then, it reads the data
stored in that sector and stores them in a data buffer. When it is ready to begin
transferring data to the initiator, the target requests control of the bus. After it
wins arbitration, it reselects the initiator controller, thus restoring the suspended
connection.
6. The target transfers the contents of the data buffer to the initiator and then
suspends the connection again. Data are transferred either 8 or 16 bits in parallel,
depending on the width of the bus.
7. The target controller sends a command to the disk drive to perform another seek
operation. Then, it transfers the contents of the second disk sector to the initiator
as before. At the end of this transfer, the logical connection between the two
controllers is terminated.
8. As the initiator controller receives the data, it stores them into the main memory
using the DMA approach.
9. The SCSI controller sends an interrupt to the processor to inform it that the
requested operation has been completed.
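The sequence of steps above can be traced with a small simulation. This is only a sketch under stated assumptions: the function name `trace_scsi_read` and the event strings are invented for illustration and do not come from the SCSI standard, and the disk seek is reduced to copying a value into a buffer.

```python
def trace_scsi_read(sectors):
    """Trace the bus events for a read spread over non-contiguous sectors.

    Returns (log, memory): log lists the bus-level events in order, and
    memory holds the data the initiator stored in main memory.
    """
    log = []
    memory = []

    # Steps 1-3: the initiator arbitrates, selects the target, and sends
    # the read command when the target asks for it.
    log.append("initiator: arbitrate")
    log.append("initiator: select target")
    log.append("target: request command")
    log.append("initiator: send READ command")

    for index, sector in enumerate(sectors):
        # Steps 4 and 6: the target suspends the connection and frees the
        # bus while the disk performs a seek.
        log.append("target: disconnect (bus free during seek)")
        buffer = sector  # steps 5 and 7: seek, then read sector into a buffer
        # Step 5: the target wins arbitration and restores the connection.
        log.append("target: arbitrate")
        log.append("target: reselect initiator")
        log.append(f"target: transfer sector {index}")
        memory.append(buffer)  # step 8: initiator stores the data using DMA

    log.append("target: terminate connection")
    log.append("initiator: interrupt processor")  # step 9
    return log, memory
```

Calling `trace_scsi_read` with two sectors produces two disconnect and reselect pairs, which is where other devices would be free to use the bus.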
This scenario shows that the messages exchanged over the SCSI bus are at a higher
level than those exchanged over the processor bus. In this context, a higher level means
that the messages refer to operations that may require several steps to complete,
depending on the device. Neither the processor nor the SCSI controller need be aware of
the details of operation of the particular device involved in a data transfer. In the
preceding example, the processor need not be involved in the disk seek operation.
UNIVERSAL SERIAL BUS (USB): The synergy between computers and communication is at the heart of today's
information technology revolution. A modern computer system is likely to involve a wide
variety of devices such as keyboards, microphones, cameras, speakers, and display
devices. Most computers also have a wired or wireless connection to the Internet. A key
requirement in such an environment is the availability of a simple, low-cost mechanism to
connect these devices to the computer, and an important recent development in this
regard is the introduction of the Universal Serial Bus (USB). This is an industry standard
developed through a collaborative effort of several computer and communication
companies, including Compaq, Hewlett-Packard, Intel, Lucent, Microsoft, Nortel
Networks, and Philips.
The USB supports two speeds of operation, called low-speed (1.5 megabits/s)
and full-speed (12 megabits/s). The most recent revision of the bus specification (USB
2.0) introduced a third speed of operation, called high-speed (480 megabits/s). The USB
is quickly gaining acceptance in the marketplace, and with the addition of the high-speed
capability it may well become the interconnection method of choice for most computer
devices.
The USB has been designed to meet several key objectives:
- Provide a simple, low-cost, and easy-to-use interconnection system that
overcomes the difficulties due to the limited number of I/O ports available on a
computer.
- Accommodate a wide range of data transfer characteristics for I/O devices,
including telephone and Internet connections.
- Enhance user convenience through a plug-and-play mode of operation.
Port Limitation: The parallel and serial ports described in the previous section provide a general-purpose point of connection through which a variety of low- to medium-speed devices can
be connected to a computer. For practical reasons, only a few such ports are provided in a
typical computer.
Device Characteristics: The kinds of devices that may be connected to a computer cover a wide range of
functionality. The speed, volume, and timing constraints associated with data transfers to
and from such devices vary significantly.

A variety of simple devices that may be attached to a computer generate data of a
similar nature: low speed and asynchronous. Computer mice and the controls and
manipulators used with video games are good examples.
Plug-and-Play: As computers become part of everyday life, their existence should become
increasingly transparent. For example, when operating a home theater system, which
includes at least one computer, the user should not find it necessary to turn the computer
off or to restart the system to connect or disconnect a device.

The plug-and-play feature means that a new device, such as an additional
speaker, can be connected at any time while the system is operating. The system should
detect the existence of this new device automatically, identify the appropriate device-driver
software and any other facilities needed to service that device, and establish the
appropriate addresses and logical connections to enable them to communicate.
The plug-and-play requirement has many implications at all levels in the system,
from the hardware to the operating system and the applications software. One of the
primary objectives of the design of the USB has been to provide a plug-and-play
capability.
USB Architecture: The discussion above points to the need for an interconnection system that
combines low cost, flexibility, and high data-transfer bandwidth. Also, I/O devices may
be located at some distance from the computer to which they are connected. The
requirement for high bandwidth would normally suggest a wide bus that carries 8, 16, or
more bits in parallel. However, a large number of wires increases cost and complexity
and is inconvenient to the user. Also, it is difficult to design a wide bus that carries data
for a long distance because of the data skew problem discussed earlier. The amount of skew
increases with distance and limits the data rate that can be used.
A serial transmission format has been chosen for the USB because a serial bus
satisfies the low-cost and flexibility requirements. Clock and data information are
encoded together and transmitted as a single signal. Hence, there are no limitations on
clock frequency or distance arising from data skew. Therefore, it is possible to provide a
high data transfer bandwidth by using a high clock frequency. As pointed out earlier, the
USB offers three bit rates, ranging from 1.5 to 480 megabits/s, to suit the needs of
different I/O devices.
Figure 23 Universal Serial Bus tree structure.
[Figure: the host computer connects to a root hub; the root hub connects to further hubs, and I/O devices attach to the ports of the hubs.]

To accommodate a large number of devices that can be added or removed at any
time, the USB has the tree structure shown in figure 23. Each node of the tree has a
device called a hub, which acts as an intermediate control point between the host and the
I/O devices. At the root of the tree, a root hub connects the entire tree to the host
computer. The leaves of the tree are the I/O devices being served (for example, keyboard,
Internet connection, speaker, or digital TV), which are called functions in USB
terminology. For consistency with the rest of the discussion in the book, we will refer to
these devices as I/O devices.
The tree structure enables many devices to be connected while using only simple
point-to-point serial links. Each hub has a number of ports where devices may be
connected, including other hubs. In normal operation, a hub copies a message that it
receives from its upstream connection to all its downstream ports. As a result, a message
sent by the host computer is broadcast to all I/O devices, but only the addressed device
will respond to that message. In this respect, the USB functions in the same way as the
bus in figure 4.1. However, unlike the bus in figure 4.1, a message from an I/O device is
sent only upstream towards the root of the tree and is not seen by other devices. Hence,
the USB enables the host to communicate with the I/O devices, but it does not enable
these devices to communicate with each other.
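The downstream broadcast and upstream-only reply described above can be sketched as follows. The class names `Hub` and `Device`, the `receive` method, and the message strings are assumptions made for this illustration; hub addresses and real packet formats are left out.

```python
class Device:
    """An I/O device (a leaf of the tree)."""

    def __init__(self, address):
        self.address = address
        self.seen = []  # every downstream message that reached this device

    def receive(self, address, message):
        self.seen.append(message)
        # Only the addressed device responds.
        if address == self.address:
            return f"reply from {self.address}"
        return None


class Hub:
    """A hub copies each downstream message to all of its ports."""

    def __init__(self, ports):
        self.ports = ports  # devices or other hubs

    def receive(self, address, message):
        reply = None
        for port in self.ports:
            response = port.receive(address, message)
            if response is not None:
                reply = response  # passed upstream only, never to siblings
        return reply


keyboard, speaker, camera = Device(1), Device(2), Device(3)
root_hub = Hub([Hub([keyboard, speaker]), camera])
reply = root_hub.receive(2, "poll")
```

Every device sees the host's message, but the reply travels only toward the root, so the keyboard and camera never see what the speaker sent.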
Note how the tree structure helps meet the USB's design objectives. The tree
makes it possible to connect a large number of devices to a computer through a few ports
(the root hub). At the same time, each I/O device is connected through a serial point-to-point connection. This is an important consideration in facilitating the plug-and-play
feature, as we will see shortly.
The USB operates strictly on the basis of polling. A device may send a message
only in response to a poll message from the host. Hence, upstream messages do not
encounter conflicts or interfere with each other, as no two devices can send messages at
the same time. This restriction allows hubs to be simple, low-cost devices.
The mode of operation described above is observed for all devices operating at
either low speed or full speed. However, one exception has been necessitated by the
introduction of high-speed operation in USB version 2.0. Consider the situation in figure
24. Hub A is connected to the root hub by a high-speed link. This hub serves one high-speed
device, C, and one low-speed device, D. Normally, a message to device D would
be sent at low speed from the root hub. At 1.5 megabits/s, even a short message takes
several tens of microseconds. For the duration of this message, no other data transfers can
take place, thus reducing the effectiveness of the high-speed links and introducing
unacceptable delays for high-speed devices. To mitigate this problem, the USB protocol
requires that a message transmitted on a high-speed link is always transmitted at high
speed, even when the ultimate receiver is a low-speed device. Hence, a message intended
for device D is sent at high speed from the root hub to hub A, then forwarded at low
speed to device D. The latter transfer will take a long time, during which high-speed
traffic to other nodes is allowed to continue. For example, the root hub may exchange
several messages with device C while the low-speed message is being sent from hub A to
device D. During this period, the bus is said to be split between high-speed and low-speed
traffic. The message to device D is preceded and followed by special commands to
hub A to start and end the split-traffic mode of operation, respectively.
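A quick calculation shows why a low-speed message is costly on a shared link. The sketch below uses the three bit rates given in the text; the 64-bit message length is an assumed example, and protocol overhead is ignored.

```python
# Bit rates in bits per second, as stated in the text.
BIT_RATES = {"low": 1.5e6, "full": 12e6, "high": 480e6}


def transfer_time_us(bits, speed):
    """Time in microseconds to send `bits` bits at the named speed."""
    return bits / BIT_RATES[speed] * 1e6


MESSAGE_BITS = 64  # assumed size of a short message
low_us = transfer_time_us(MESSAGE_BITS, "low")    # roughly 42.7 microseconds
high_us = transfer_time_us(MESSAGE_BITS, "high")  # roughly 0.13 microseconds
```

Under these assumptions the same message occupies the link 320 times longer at low speed than at high speed, which is the delay that split-traffic operation avoids on the high-speed link.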
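Figure 24 Split bus operation
[Figure: the host computer connects to the root hub, which connects to hub A and hub B. Hub A is attached by a high-speed link and serves high-speed device C and full/low-speed device D. Legend: HS = high speed, F/LS = full/low speed.]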

The USB standard specifies the hardware details of USB interconnections as well
as the organization and requirements of the host software. The purpose of the USB
software is to provide bidirectional communication links between application software
and I/O devices. These links are called pipes. Any data entering at one end of a pipe is
delivered at the other end. Issues such as addressing, timing, or error detection and
recovery are handled by the USB protocols.
Addressing: In earlier discussions of input and output operations, we explained that I/O
devices are normally identified by assigning them a unique memory address. In fact, a
device usually has several addressable locations to enable the software to send and
receive control and status information and to transfer data.
When a USB is connected to a host computer, its root hub is attached to the
processor bus, where it appears as a single device. The host software communicates with
individual devices attached to the USB by sending packets of information, which the
root hub forwards to the appropriate device in the USB tree.
Each device on the USB, whether it is a hub or an I/O device, is assigned a 7-bit
address. This address is local to the USB tree and is not related in any way to the
addresses used on the processor bus. A hub may have any number of devices or other
hubs connected to it, and addresses are assigned arbitrarily. When a device is first
connected to a hub, or when it is powered on, it has the address 0. The hardware of the
hub to which this device is connected is capable of detecting that the device has been
connected, and it records this fact as part of its own status information. Periodically, the
host polls each hub to collect status information and learn about new devices that may
have been added or disconnected. When the host is informed that a new device has been
connected, it uses a sequence of commands to send a reset signal on the corresponding
hub port, read information from the device about its capabilities, send configuration
information to the device, and assign the device a unique USB address. Once this
sequence is completed the device begins normal operation and responds only to the new
address.
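The address-assignment step can be sketched as follows. The class `UsbHost`, its `enumerate` method, and the dictionary used to stand in for a device are hypothetical; the reset, capability-read, and configuration commands are omitted, and the lowest-free-address policy is an assumption, since the text only says that addresses are assigned arbitrarily.

```python
class UsbHost:
    MAX_ADDRESS = 127  # largest value that fits in a 7-bit address

    def __init__(self):
        self.devices = {}  # assigned address -> device record

    def enumerate(self, device):
        """Give a newly attached device a unique, nonzero address."""
        if device["address"] != 0:
            raise ValueError("a newly connected device must be at address 0")
        for address in range(1, self.MAX_ADDRESS + 1):
            if address not in self.devices:
                device["address"] = address
                self.devices[address] = device
                return address
        raise RuntimeError("no free USB addresses")


host = UsbHost()
keyboard = {"name": "keyboard", "address": 0}
mouse = {"name": "mouse", "address": 0}
keyboard_address = host.enumerate(keyboard)
mouse_address = host.enumerate(mouse)
```

Address 0 is never handed out, so it stays available for the next device that is plugged in.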
When a device is powered off, a similar procedure is followed. The
corresponding hub reports this fact to the USB system software, which in turn updates its
tables. Of course, if the device that has been disconnected is itself a hub, all devices
connected through that hub must also be recorded as disconnected. The USB software
must maintain a complete picture of the bus topology and the connected devices at all
times.
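The bookkeeping for a disconnected hub can be illustrated with a small recursive update of a topology table. The table layout (a dictionary mapping each hub to the list of nodes on its ports) and the function name `disconnect` are assumptions for this sketch.

```python
def disconnect(topology, node):
    """Remove `node` and everything attached below it.

    `topology` maps each hub name to the list of nodes on its ports.
    Returns the names of all nodes recorded as disconnected.
    """
    removed = [node]
    # If the node is a hub, everything reached through it is gone as well.
    for child in topology.pop(node, []):
        removed.extend(disconnect(topology, child))
    # Detach the node from the port list of its parent hub.
    for children in topology.values():
        if node in children:
            children.remove(node)
    return removed


topology = {
    "root": ["hubA", "camera"],
    "hubA": ["keyboard", "speaker"],
}
removed = disconnect(topology, "hubA")
```

After the call, only the camera remains attached to the root hub.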
USB Protocols: All information transferred over the USB is organized in packets, where a packet
consists of one or more bytes of information. There are many types of packets that
perform a variety of control functions. We illustrate the operation of the USB by giving a
few examples of the key packet types and show how they are used.
The information transferred on the USB can be divided into two broad
categories: control and data. Control packets perform such tasks as addressing a device to
initiate data transfer, acknowledging that data have been received correctly, or indicating
an error. Data packets carry information that is delivered to a device. For example, input
and output data are transferred inside data packets.
A packet consists of one or more fields containing different kinds of information.
The first field of any packet is called the packet identifier, PID, which identifies the type
of that packet. There are four bits of information in this field, but they are transmitted
twice. The first time they are sent with their true values, and the second time with each
bit complemented, as shown in figure 4.45a. This enables the receiving device to verify
that the PID byte has been received correctly.
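The complement check on the PID field can be sketched in a few lines. The function names are hypothetical, and placing the true bits in the low half of the byte and the complemented bits in the high half is an assumption about bit layout made for this example.

```python
def encode_pid(pid):
    """Build the 8-bit PID field: four PID bits plus their complement."""
    if not 0 <= pid <= 0xF:
        raise ValueError("PID must fit in 4 bits")
    return pid | ((~pid & 0xF) << 4)


def decode_pid(byte):
    """Return the 4-bit PID, or None if the check bits do not match."""
    pid = byte & 0xF
    check = (byte >> 4) & 0xF
    return pid if check == (~pid & 0xF) else None
```

For example, `encode_pid(0b0010)` gives `0xD2`, and `decode_pid` returns `None` for that byte if any single bit of it is flipped.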
The four PID bits identify one of 16 different packet types. Some control packets,
such as ACK (Acknowledge), consist only of the PID byte. Control packets used for
controlling data transfer operations are called token packets. They have the format shown
in figure 4.45b. A token packet starts with the PID field, using one of two PID values to
distinguish between an IN packet and an OUT packet, which control input and output
transfers, respectively. The PID field is followed by the 7-bit address of a device and the
4-bit endpoint number within that device. The packet ends with 5 bits for error checking,
using a method called cyclic redundancy check (CRC). The CRC bits are computed based
on the contents of the address and endpoint fields. By performing an inverse
computation, the receiving device can determine whether the packet has been received
correctly.
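The error check on a token packet can be sketched with a 5-bit CRC. The generator polynomial x^5 + x^2 + 1 is the one used for USB token packets; the all-ones starting value, the bit ordering, and the helper names `crc5`, `make_token`, and `token_ok` are simplifying assumptions, so this sketch is not claimed to be bit-exact with the USB specification. Here the receiver checks the packet by recomputing the CRC and comparing.

```python
def crc5(value, nbits):
    """Bitwise CRC with generator x^5 + x^2 + 1 over the low nbits of value."""
    crc = 0x1F  # assumed starting value: all ones
    for i in range(nbits):
        bit = (value >> i) & 1  # process least significant bit first
        top = (crc >> 4) & 1
        crc = (crc << 1) & 0x1F
        if bit ^ top:
            crc ^= 0x05  # x^2 + 1 (the x^5 term is implicit)
    return crc


def make_token(pid_byte, addr, endp):
    """Return a token as (PID byte, 7-bit address, 4-bit endpoint, CRC)."""
    if not (0 <= addr < 128 and 0 <= endp < 16):
        raise ValueError("address needs 7 bits and endpoint needs 4 bits")
    fields = addr | (endp << 7)  # the 11 bits covered by the CRC
    return (pid_byte, addr, endp, crc5(fields, 11))


def token_ok(token):
    """Receiver side: recompute the CRC and compare with the received one."""
    _, addr, endp, crc = token
    return crc == crc5(addr | (endp << 7), 11)
```

A token built by `make_token` passes `token_ok`, while a token whose address or endpoint has a single flipped bit is rejected.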
Data packets, which carry input and output data, have the format shown in part (c) of
the packet-format figure. The packet identifier field is followed by up to 8192 bits of data, then 16 error-checking
bits. Three different PID patterns are used to identify data packets, so that data packets
may be numbered 0, 1, or 2. Note that data packets do not carry a device address or an
endpoint number. This information is included in the IN or OUT token packet that
initiates the transfer.

Successive data packets on a full-speed or low-speed pipe carry the numbers 0
and 1, alternately. This simplifies recovery from transmission errors. If a token, data, or
acknowledgement packet is lost as a result of a transmission error, the sender resends the
entire sequence. By checking the data packet number in the PID field, the receiver can
detect and discard duplicate packets. High-speed data packets are sequentially numbered
0, 1, 2, 0, and so on.
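The duplicate-detection rule for full-speed and low-speed pipes can be sketched as a receiver that tracks the number it expects next. The class name `Receiver` and the method `on_data` are hypothetical, and error checking of the packet contents is left out.

```python
class Receiver:
    def __init__(self):
        self.expected = 0  # number expected on the next new data packet
        self.delivered = []

    def on_data(self, number, payload):
        """Acknowledge every packet, but deliver each payload only once."""
        if number == self.expected:
            self.delivered.append(payload)
            self.expected ^= 1  # alternate 0, 1, 0, 1, ...
        # A packet carrying the previous number is a retransmission and is
        # discarded; it is still acknowledged so that the sender can move on.
        return "ACK"


receiver = Receiver()
receiver.on_data(0, "first")
receiver.on_data(0, "first")  # resent because the earlier ACK was lost
receiver.on_data(1, "second")
```

After these three calls, `receiver.delivered` holds each payload exactly once.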

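Figure 24 USB packet formats
(a) Packet identifier field (8 bits): PID0 PID1 PID2 PID3, followed by the complements of PID0 PID1 PID2 PID3.
(b) Token packet, IN or OUT: PID (8 bits), ADDR (7 bits), ENDP (4 bits), CRC (5 bits).
(c) Data packet: PID (8 bits), DATA (0 to 8192 bits), CRC16 (16 bits).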

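Fig 27 An output transfer
[Figure: timing diagram with columns for the host, the hub, and the I/O device and a time axis, showing the packet sequence Token, Data0, ACK, Token, Data0, ACK, Token, Data1, ACK, Token, Data1, ACK.]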

Input operations follow a similar procedure. The host sends a token packet of
type IN containing the device address. In effect, this packet is a poll asking the device to
send any input data it may have. The device responds by sending a data packet, which the
host acknowledges with an ACK. If the device has no data ready, it responds by sending a negative acknowledgement
(NAK) instead.
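The device side of an input poll can be sketched as follows. The class `InputDevice`, the method `on_in_token`, and the tuple used to represent a response are assumptions for this example; the acknowledgement that completes the transfer is not modelled.

```python
from collections import deque


class InputDevice:
    def __init__(self, address):
        self.address = address
        self.pending = deque()  # input data waiting to be collected

    def on_in_token(self, address):
        """Respond to an IN token: data if available, otherwise NAK."""
        if address != self.address:
            return None  # the token is for another device: stay silent
        if self.pending:
            return ("DATA", self.pending.popleft())
        return ("NAK", None)


keyboard = InputDevice(address=5)
keyboard.pending.append(b"k")
first = keyboard.on_in_token(5)   # data is waiting
second = keyboard.on_in_token(5)  # nothing left to send
```

The device never transmits on its own initiative; it only answers when the host's token carries its address.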
We pointed out that a bus that has a mix of full/low-speed links and high-speed
links uses the split-traffic mode of operation in order not to delay messages on high-speed
links. In such cases, an IN or an OUT packet intended for a full- or low-speed device is
preceded by a special control packet that starts the split-traffic mode.
Isochronous Traffic on USB: One of the key objectives of the USB is to support the transfer of isochronous
data, such as sampled voice, in a simple manner. Devices that generate or receive
isochronous data require a time reference to control the sampling process. To provide this
reference, transmission over the USB is divided into frames of equal length. A frame is
1 ms long for low- and full-speed data. The root hub generates a Start of Frame control
packet (SOF) precisely once every 1 ms to mark the beginning of a new frame.
The arrival of an SOF packet at any device constitutes a regular clock signal that
the device can use for its own purposes. To assist devices that may need longer periods of
time, the SOF packet carries an 11-bit frame number, as shown. Following each SOF
packet, the host carries out input and output transfers for isochronous devices. This
means that each device will have an opportunity for an input or output transfer once
every 1 ms.
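The relationship between elapsed time and the frame number carried in the SOF packet can be sketched as below. It assumes, for illustration only, that the count starts at zero and wraps around once the 11-bit field is exhausted.

```python
FRAME_MS = 1            # frame length for low- and full-speed data
FRAME_NUMBER_BITS = 11  # width of the frame number in the SOF packet


def frame_number(elapsed_ms):
    """Frame number carried by the SOF packet sent at `elapsed_ms`."""
    return (elapsed_ms // FRAME_MS) % (1 << FRAME_NUMBER_BITS)
```

With an 11-bit field there are 2048 distinct frame numbers, so under these assumptions the number repeats every 2.048 seconds; a device that needs a longer time reference must count the wraparounds itself.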
Electrical Characteristics: The cables used for USB connections consist of four wires. Two are used to carry
power, +5V and Ground. Thus, a hub or an I/O device may be powered directly from the
bus, or it may have its own external power connection. The other two wires are used to
carry data. Different signaling schemes are used for different speeds of transmission. At
low speed, 1s and 0s are transmitted by sending a high voltage state (5V) on one or the
other of the two signal wires. For high-speed links, differential transmission is used.

************************
