Operations: Years John Von Neumann Shows Main in From Bus

Download as pdf or txt
Download as pdf or txt
You are on page 1of 4

Fifty years ago John von Neumann

proposed the concept of the stored-program computer in which both instructions and data are confined within the
boundaries of a stoi-age device called
memory. Today a computer architect is
faced with a bewildering variety of memory-types to choose from when implementing this concept in the hardware.
The most commonly used storage
device is called random-access memory
{RAM}.That is, the process of locating a
word within the storage array involves
giving its address. The time needed to
retrieve the word remains the same irrespective of the physical location of the
word in the array.
However, many data-processing
applications require searching items in
some data structure, such as a table,
stored in the memory. The established
procedure to search a table is: 1) to store
all the items where they can be accessed
in sequence; 2) choose a sequence of
addresses; 3 ) read the contents of memory at each address; and 4) compare the
information read with the item being
searched until a match occurs.
This is exorbitantly time-consuming if
the table is very large and/or the search
algorithm is relatively inefficient. This
time can be significantly reduced if the
stored data can be identified for access
by the content of the data itself rather
than by an address.
A memory unit accessed by content is
called an associative memory or contentaddressable memory (CAM). This type
of memory is accessed simultaneously
and in parallel on the basis of the data
content.

What is RAM?
Random-access memories are characterized by the fact that every location can
be accessed independently. Also, the
access time is constant for each word in

26

the memory. Figure 1 shows the main


components of a random-access memory. The storage array is composed of n
words; each word consists of cells; and
each cell can store one bit of i d o m d o n .

RAM operations
The two operations that a randomaccess memory can perform are the read
and write operations. The read signal

a stored word out of the memory (read)


are as follows:
1. Present the binary address of the
desired word (available on address bus)
to the memory address register.
2. Activate the Readwrite Control
for read operation.
The RAM will then take the bits from
the word selected through the address and
make them available in the data register
read. The contents of the selected word do
not change after reading.
The steps that must be taken for the
purpose of transferring a new word to
be stored in the memory (write) are as
follows:
1. Transfer the binary address of the
word to be written from the address bus
to the memory address register.
2. Transfer the data bits that must be

0278-6648/97/$10.00 01997 IEEE

stored in memory from the data bus to


the data register wnte.
3 Activate the Readwrite Control
for write operation.
The memory unit will then take tlie
bits from the input data lines and store
them in the word specified by the
address lines.
Since it is usually not desirableto permit simultaneous reading and writing,

of the data bus may then be merged to


form a single bidirectional data bus.

RAM types
Random-access memories can be
broadly classified into static and dynamic
memories based on their physical characteristics. In static random-access memories (SRAMs), the individual words,
once written, do not need to be further
addressed or manipulated to hold their
values. These devices are composed of
flip-flops that use a small current to
maintain their logic level. They are used
mostly for the CPU registers and other
high-speed storage devices such as
cache. In dynamic random-access memories (DRAMS),refresh circuits are used
to maintain the information in the memory over time.

IEEE POTENTIALS

Problemswith RAM
Todays computers rely on RAM.
Every accessed word must travel individually between the processing unit and the
memory through a communication medium, usually a shared bus. The simplicity
of this approach has ensured its success as evident by the
ubiquitous nature of
computers worldwide.
However, there are
some inherent drawbacks to a wordat-a-time, locationaddressed memory.
One major problem
of address-based memory is called the von Neumann bottleneck. That
is, the memory access
path becomes the limiting factor for system
performance and most
of the traffic is sending
information back and
forthjust to calculate the
effective address of the
necessary data word.
A second important
nature of the processing.
Each piece of information must be handled sequentially. This approach is particularly slow in search-and-compare
problems, where many items must be
inspected to determine the outcome. The
search time increases at the same rate as
the list size. The performance penalty
increases if a more complex comparison
is necessary while searching, such as correlating or sorting the data.
Techniques such as hashing and hardware execution pipelines attempt to d e viate the problems by reducing the search
time and overlapping the functions. Nevertheless, improvement using conventional methods is limited.
The serious disadvantages inherent in
random-access memories become more
obvious when multiple processing units
are introduced. Modem parallel-processing architectures, such as dataflow
machines, exploit parallelism to increase
their execution performance. Such parallel systems do not execute efficiently on
sequential memory. (Each data word can
only be accessed serially by its location in
a large sequential array.)

Why CAM?
Consider a table shown in Fig 2 ,
which is to be stored in a computers
memory. It consists of a list of records,

APRIUMAY 1997

each containing three subfields: a persons name, a social security number,


and an age. Say a conventional randomaccess memory is used for storage and
we are presented with a question like
What are A. Smiths social security
number and age? It then will be neces-

sary to specify exactly the physical


address of Smith entry in the table, e.g.,
by the instruction
READ ROW 4
The address ROW 4 has no logical
relationship to Smith; hence, it can be
viewed as an artificial construct which
adds to programming complexity.
An altemative approach would be to
search the entire table using the name
field as an address. The request for the
Smith data would then be in the form of
an instruction such as
READ NAME = A. SMITH
This would involve scanning all
entries in the table sequentially and
comparing their NAME fields to the
given address, A. SMITH, until a match
is found. Sequential searching of this
type is very time-consuming if the table
is very large.
An associative or content-addressable memory eliminates this difficulty.
It simultaneously examines all entries
in the table and selects the one that
matches the given address. Furthermore, we are able to select other fields
of the record also to use as an address.
For example,
READ SSN = 456-78-9012
uses the SSN field as an address and
accesses the Smith entry in the table.

What iS CAM?
CAM is defined as a collection of
storage elements, c d e d associative cells,
which are accessed in parallel on the
basis of data contents rather than by specific address or location. Each associative
cell has the hardware capability to store
and search its contents
against the data broadcasted by the control unit.
It then indicates a match
or mismatch by the state
of a flip-flop.
A typical associative
memory architecture is
shown in Fig. 3. The
essential components
of this associative
memory are:
Comparand register:
It contains the data to be
compared against the
contents of the memory
array.
Mask register: It is
used to mask off portions
of the data word@)which
do not participate in the
operations.
Memory array: provides storage and search
medium for data.
Word select register: It generates signals, based on input address lines, to
select the memory words which participate in the operations.
Responder and contention resolution
logic: Besides indicating success or failure of a search operation by settinghesetting of the flip-flop, it is also responsible
for resolving the contention if more than
one word matches the data in the comparand register. To provide this capability,
the cells of the memory are organized
such that each cell has a predecessor and
a successor. Using additional circuitry to
provide select f i t responder function,
the tag bit of any cell is m e d off whose
predecessor, or the predecessor of whose

Fig. 2 A sumple fable for sbmge

27

Fig. 3 A typical CAM structure

predecessor, or indeed any of whose


ancestors is on. Furthermore, algorithms have been developed to facilitate
ordered retrieval from the memory in
case of multiple responses. Examples
include Frei-Goldbergs Algorithm, Seeber-Lindquists Algorithm, and Lewins
Algorithm.
0 Output register: It acts as an interface
to output the data from the memory array
when a search operation has been successfully accomplished.

gle word in the associative array. This


organization represents the hardware
realization of a simple program loop in
linear search.
Block-oriented In this class, the associative capability is provided at the mass
storage level (e.g., secondary storage).
The fixed readwrite heads are extended
as a small processor and as the data passes under the readwrite heads, it is investigated and marked for later accesses.

CAM types

The basis operations in a CAM are


search, read, and write. To cany out the
search operation, algorithms are written
depending upon the requirement. For
example, to search for the largest number
stored in the CAM, an example of the
search algorithm is given in Fig. 4.
The initial contents of the comparand
and mask registers are shown at the top
of the diagram. The comparand is all
ones to start out with while the mask has
one in the left-most bit (most-significant) position only. By the time we finish, the mask will have all ones and the
comparand will hold a copy of the
largest word in the array.
We begin by setting all the tag bits

Associative memories have been classified into four categories: fully parallel,
bit-serial, word-serial,and block-oriented.
Fully parallel: In a fully parallel organization, the search capability is associated with every bit of the associative word.
Therefore the associative operation can
be performed along two dimensions
simultaneously.
Bit-serial: In this organization, the
memory is organized as a collection of
circular shift registers in which search
capability is associated with a designated
bit within each word.
Word-serial: In this class, the search
capability is associated with just a sin-

28

CAM operations

and then looking for an exact match to


the masked word. This selects all those
words holding a one in their most-significant bit position. If the search is successful (indicated by the state of the flip-flops
in the responder), then it means that the
largest word has a one in its most-significant position. Otherwise, the largest
word must have a zero at this position.
Thus, we change the bit of the comparand we were just examining to zero.
We are not finished yet so we put an
additional one into the mask and go back
to search again for the exact match. This
time we go to the first two bits of the
comparand. As we proceed round the
loop across the word from left to right,
the comparand grows to look more like
the largest word in the array. By the time
we reach the right-hand end of the word,
well have extracted a copy of the desired
word in the comparand register. This
algorithm is bit-serial and word-parallel;
that is, we process all words simultaneously but one bit at a time.
Let n be the number of words in the
memory array and m be the number of
bits per word, then using the algorithm
given in Fig. 4, it will take only m time
units to find the largest number stored in
the CAM, while itll take mxn time units
in the conventional RAM to accomplish
the same task.
To select a particular word for reading, we present its address to the word
select register. We then wait for a few
microseconds until the contents become
available in the output register. Since
there is no address associated with the
word in the CAM, we distinguish one
word from another by the state of the flipflop in the responder corresponding to the
given word.
If we wish to write a one into a particular bit of all the words in the CAM, then
we energize the write line. Upon setting
the responders flip-flops, a one will be
stored at that bit position for every word.
If we do not wish to write into some
word(s), we can disable the entire
word(s) using word select register, and if
we do not want to write into some particular bit of a word then we simply do not
energize the write line for that bit position
in the word. Thus, given appropriate control signals, we can write anywhere within the associative memory array.

Problems with CAM


There have been a number of obstacles to commercially successful associative memories. Some of these are
listed below:

IEEE POTENTIALS

fnnctional and design complexity of


the associative memories,
relatively high cost for reasonable
storage capacity,
poor storage density compared to
conventional memory,
slow access time due to available
methods of implementation,
a lack of software to properly use the
associative power of the memory,
An associative memory is more
expensive to build and has low storage
density than a conventional randomaccess memory. This is because of the
overhead involved in the storage, comparison, manipulation, and output selection logic. Very little software is available
for content-addressablememories. Even
though, programming for an associative
memory is not anymore difficult than for
conventional memory.

CAM applications
Associative processing has a natural
affinity for use in artificial intelligence
(AI) applications, and it has also been
successfully implemented for radar signal

tracking, image processing, address filtering for communication networks, and


arithmetic and logarithmic operations on
large sets of data. The CAM is useful in
speeding up applications which are
search-intensive and involve patternmatching. Examples of such applications
are large-scale database searching and
dictionary retrieval.

Summary
No one kind of memory can be used
to supply all the memory needs within a
computer system. The deciding factors
in choosing a particular kind are usually
the cost and the nature of intended
application.
RAMS are fast, storage-efficient, and
secure. But they also are passive devices,
and require complex and large data structures to be broken down into small
pieces, thus losing intrinsic links. Moreover, RAMS lack the key feature of the
human memory, i.e., the ability to form
associations among memory items.
CAMs, on the other hand, are expensive but active devices. They allow paral-

lelism to be implemented at the basic


level of computer hardware. With the
advancementsin VLSI (Very Large Scale
Integration) technology, it is now possible to fabricate large capacity CAMs for
applications such as database searching
and dictionary retrieval.

Read more about it


John P. Hayes, Computer Architecture and Organization (2nd Edition),
McGraw-Hill, 1988
Caxton C. Foster, Content Address-

able Parallel Processors, Van Nostrand


Reinhold Co., 1976
M. Morris Mano, Computer System
Architecture (3rd Edition), Prentice Hall,
1993
M. Morris Mano, Computer Engineering Hardware Design, Prentice Hall,
1988
C. H. Chen, Computer Engineering
Handbook, McGraw-Hill, 1992
Robert J. Baron and Lee Higbie,
Computer Architecture, Addison-Wesley ,

1992
Lawrence Chisvin and R. James
Duckworth, Content-Addressable and
Associative Memory: Alternatives to the
Ubiquitous RAM,Computer,July 1989,
pp. 51-64
IEEE Computer, special issue on
Associative Processing, November 1994
issue
IEEE Micro, special issues on Associative Memories and Processors, Parts 1
and 2, June and December 1992 issues.

About the author

Fig. 4 An example of a CAM search algorithm

APRIUMAY 1997

Tariq Jamil, the Student Editor of


ZEEE Potentials, is currently an assistant
professor in the Faculty ofComputer Science and Engineering at the Ghulam
Ishaq Khan Institute of Engineering Sciences and Technology, Topi, Pakistan.
He received his Ph.D. and M.S. degrees
in computer engineering from Florida
Institute of Technology, Melbourne,
Florida, in 1996 and 1992 respectively,
and earned his B.Sc. (Hons.) degree in
electrical engineering from the NWFP
University of Engineering and Technology, Peshawar, Pakistan, in 1989. His doctoral research involved incorporation of
content-addressable memories within the
dataflow environment and his dissertation
has been published by UMI, P.O. Box
1346, Ann Arbor, MI 48106-1346, entitled Associative Dataflow Processor
Architecture. He can be contacted by email at address: tjamil @giki.sdnpk.
undp.org

29

You might also like