Harvard Architecture: Memory Details


Harvard architecture

From Wikipedia, the free encyclopedia

The Harvard architecture is a computer architecture with physically separate storage and signal pathways
for instructions and data. The term originated from the Harvard Mark I relay-based computer, which stored
instructions on punched tape (24 bits wide) and data in electro-mechanical counters. These early machines
had limited data storage, entirely contained within the central processing unit, and provided no access to
the instruction storage as data. Programs had to be loaded by an operator; the processor could
not boot itself.

Today, most processors implement such separate signal pathways for performance reasons but actually
implement a Modified Harvard architecture, so they can support tasks like loading a program from disk
storage as data and then executing it.

Contents

1 Memory details
 1.1 Contrast with von Neumann architectures
 1.2 Contrast with Modified Harvard architecture
2 Speed
 2.1 Internal vs. external design
3 Modern uses of the Harvard architecture
4 External links

Memory details

In a Harvard architecture, there is no need to make the two memories share characteristics. In particular,
the word width, timing, implementation technology, and memory address structure can differ. In some
systems, instructions can be stored in read-only memory while data memory generally requires read-write
memory. In some systems, there is much more instruction memory than data memory so instruction
addresses are wider than data addresses.

Contrast with von Neumann architectures


Under pure von Neumann architecture the CPU can be either reading an instruction or reading/writing data
from/to the memory. Both cannot occur at the same time since the instructions and data use the same bus
system. In a computer using the Harvard architecture, the CPU can both read an instruction and perform a
data memory access at the same time, even without a cache. A Harvard architecture computer can thus be
faster for a given circuit complexity because instruction fetches and data accesses do not contend for a single
memory pathway.

Also, a Harvard architecture machine has distinct code and data address spaces: instruction address zero
is not the same as data address zero. Instruction address zero might identify a twenty-four-bit value, while
data address zero might indicate an eight-bit byte that isn't part of that twenty-four-bit value.

Contrast with Modified Harvard architecture

A modified Harvard architecture machine is very much like a Harvard architecture machine, but it relaxes
the strict separation between instruction and data while still letting the CPU concurrently access two (or
more) memory buses. The most common modification includes separate instruction and
data caches backed by a common address space. While the CPU executes from cache, it acts as a pure
Harvard machine. When accessing backing memory, it acts like a von Neumann machine (where code can
be moved around like data, a powerful technique). This modification is widespread in modern processors
such as the ARM architecture and x86 processors. It is sometimes loosely called a Harvard architecture,
overlooking the fact that it is actually "modified".

Another modification provides a pathway between the instruction memory (such as ROM or flash) and the
CPU to allow words from the instruction memory to be treated as read-only data. This technique is used in
some microcontrollers, including the Atmel AVR. This allows constant data, such as text strings or function
tables, to be accessed without first having to be copied into data memory, preserving scarce (and power-
hungry) data memory for read/write variables. Special machine language instructions are provided to read
data from the instruction memory. (This is distinct from instructions which themselves embed constant data,
although for individual constants the two mechanisms can substitute for each other.)

Speed

In recent years, the speed of the CPU has grown many times in comparison to the access speed of the
main memory. Care needs to be taken to reduce the number of times main memory is accessed in order to
maintain performance. If, for instance, every instruction run in the CPU requires an access to memory, the
computer gains nothing from increased CPU speed, a problem referred to as being "memory bound".

It is possible to make extremely fast memory, but this is only practical for small amounts of memory, for cost,
power and signal routing reasons. The solution is to provide a small amount of very fast memory known as
a CPU cache, which holds recently accessed data. As long as the memory that the CPU needs is in the
cache, the performance hit is much smaller than when the cache has to fetch the data from
main memory.

Internal vs. external design

Modern high performance CPU chip designs incorporate aspects of both Harvard and von Neumann
architecture. In particular, the Modified Harvard architecture is very common. CPU cache memory is divided
into an instruction cache and a data cache. Harvard architecture is used as the CPU accesses the cache. In
the case of a cache miss, however, the data is retrieved from the main memory, which is not formally
divided into separate instruction and data sections, although it may well have separate memory controllers
used for concurrent access to RAM, ROM and (NOR) flash memory.

Thus, while a von Neumann architecture is visible in some contexts, such as when data and code come
through the same memory controller, the hardware implementation gains the efficiencies of the Harvard
architecture for cache accesses and at least some main memory accesses.

In addition, CPUs often have write buffers which let CPUs proceed after writes to non-cached regions. The
von Neumann nature of memory is then visible when instructions are written as data by the CPU and
software must ensure that the caches (data and instruction) and write buffer are synchronized before trying
to execute those just-written instructions.

Modern uses of the Harvard architecture

The principal advantage of the pure Harvard architecture—simultaneous access to more than one memory
system—has been reduced by modified Harvard processors using modern CPU cache systems. Relatively
pure Harvard architecture machines are used mostly in applications where tradeoffs, such as the cost and
power savings from omitting caches, outweigh the programming penalties from having distinct code and
data address spaces.

 Digital signal processors (DSPs) generally execute small, highly optimized audio or video
processing algorithms. They avoid caches because their behavior must be extremely reproducible. The
difficulties of coping with multiple address spaces are of secondary concern to speed of execution. As a
result, some DSPs have multiple data memories in distinct address spaces to
facilitate SIMD and VLIW processing. Texas Instruments TMS320 C55x processors, as one example,
have multiple parallel data buses (two write, three read) and one instruction bus.

 Microcontrollers are characterized by having small amounts of program (flash memory) and data
(SRAM) memory, with no cache, and take advantage of the Harvard architecture to speed processing
by concurrent instruction and data access. The separate storage means the program and data
memories can have different bit depths, for example using 16-bit-wide instructions and 8-bit-wide data.
It also means that instruction prefetch can be performed in parallel with other activities. Examples
include the AVR by Atmel Corp., the PIC by Microchip Technology, Inc., and the ARM Cortex-M3
processor (not all ARM chips have a Harvard architecture).

Even in these cases, it is common to have special instructions to access program memory as data for read-
only tables, or for reprogramming.

Von Neumann architecture


From Wikipedia, the free encyclopedia

Schematic of the von Neumann architecture. The Control Unit and Arithmetic Logic Unit form the main components of
the Central Processing Unit (CPU).

The von Neumann architecture is a design model for a stored-program digital computer that uses
a central processing unit (CPU) and a single separate storage structure ("memory") to hold both instructions
and data. It is named after the mathematician and early computer scientist John von Neumann. Such
computers implement a universal Turing machine and have a sequential architecture.

A stored-program digital computer is one that keeps its programmed instructions, as well as its data,
in read-write, random-access memory (RAM). Stored-program computers were an advancement over the
program-controlled computers of the 1940s, such as the Colossus and the ENIAC, which were programmed
by setting switches and inserting patch leads to route data and control signals between various functional
units. In the vast majority of modern computers, the same memory is used for both data and program
instructions. The mechanisms for transferring the data and instructions between the CPU and memory are,
however, considerably more complex than the original von Neumann architecture.

The terms "von Neumann architecture" and "stored-program computer" are generally used interchangeably,
and that usage is followed in this article.
Contents

1 Description
2 Development of the stored-program concept
3 Early von Neumann-architecture computers
4 Early stored-program computers
5 Non-von Neumann processors
6 See also
7 References
 7.1 Inline
 7.2 General
8 External links

Description

The earliest computing machines had fixed programs. Some very simple computers still use this design,
either for simplicity or training purposes. For example, a desk calculator (in principle) is a fixed program
computer. It can do basic mathematics, but it cannot be used as a word processor or a gaming console.
Changing the program of a fixed-program machine requires re-wiring, re-structuring, or re-designing the
machine. The earliest computers were not so much "programmed" as they were "designed".
"Reprogramming", when it was possible at all, was a laborious process, starting with flowcharts and paper
notes, followed by detailed engineering designs, and then the often-arduous process of physically re-wiring
and re-building the machine. It could take three weeks to set up a program on ENIAC and get it working.[1]

The idea of the stored-program computer changed all that: a computer that by design includes
an instruction set and can store in memory a set of instructions (a program) that details the computation.

A stored-program design also lets programs modify themselves while running. One early motivation for
such a facility was the need for a program to increment or otherwise modify the address portion of
instructions, which had to be done manually in early designs. This became less important when index
registers and indirect addressing became usual features of machine architecture. Self-modifying code has
largely fallen out of favor, since it is usually hard to understand and debug, as well as being inefficient under
modern processor pipelining and caching schemes.

On a large scale, the ability to treat instructions as data is what makes assemblers, compilers and other
automated programming tools possible. One can "write programs which write programs".[2] On a smaller
scale, I/O-intensive machine instructions such as the BITBLT primitive, used to modify images on a bitmap
display, were once thought to be impossible to implement without custom hardware. It was shown later that
these instructions could be implemented efficiently by "on the fly compilation" ("just-in-time compilation")
technology, e.g., code-generating programs—one form of self-modifying code that has remained popular.

There are drawbacks to the von Neumann design. Aside from the von Neumann bottleneck described
below, program modifications can be quite harmful, either by accident or design. In some simple stored-
program computer designs, a malfunctioning program can damage itself, other programs, or the operating
system, possibly leading to a computer crash. Memory protection and other forms of access control can
usually protect against both accidental and malicious program modification.

Development of the stored-program concept

The mathematician Alan Turing, who had been alerted to a problem of mathematical logic by the lectures
of Max Newman at the University of Cambridge, wrote a paper in 1936 entitled On Computable Numbers,
with an Application to the Entscheidungsproblem, which was published in the Proceedings of the London
Mathematical Society.[3] In it he described a hypothetical machine which he called a "universal computing
machine", and which is now known as the "universal Turing machine". The hypothetical machine had an
infinite store (memory in today's terminology) that contained both instructions and data. The German
engineer Konrad Zuse independently wrote about this concept in 1936.[4] John von Neumann became
acquainted with Turing when he was a visiting professor at Cambridge in 1935 and also during the year that
Turing spent at Princeton University in 1936-37. Whether he knew of Turing's 1936 paper at that time is not
clear.

Independently, J. Presper Eckert and John Mauchly, who were developing the ENIAC at the Moore School
of Electrical Engineering, at the University of Pennsylvania, wrote about the stored-program concept in
December 1943.[5][6] In planning a new machine, EDVAC, Eckert wrote in January 1944 that they would
store data and programs in a new addressable memory device, a mercury metal delay line memory. This
was the first time the construction of a practical stored-program machine was proposed. At that time, they were not
aware of Turing's work.

Von Neumann was involved in the Manhattan Project at the Los Alamos National Laboratory, which
required huge amounts of calculation. This drew him to the ENIAC project, in the summer of 1944. There he
joined the ongoing discussions on the design of this stored-program computer, the EDVAC. As part of
that group, he volunteered to write up a description of it. The term "von Neumann architecture" arose from
von Neumann's paper First Draft of a Report on the EDVAC, dated 30 June 1945, which included ideas from
Eckert and Mauchly. It was unfinished when his colleague Herman Goldstine circulated it with only von
Neumann's name on it, to the consternation of Eckert and Mauchly.[7] The paper was read by dozens of von
Neumann's colleagues in America and Europe, and influenced the next round of computer designs.

Von Neumann was, then, not alone in putting forward the idea of the stored-program architecture, and Jack
Copeland considers that it is "historically inappropriate, to refer to electronic stored-program digital
computers as 'von Neumann machines'".[8] His Los Alamos colleague Stan Frankel said of his regard for
Turing's ideas:
I know that in or about 1943 or '44 von Neumann was well aware of the fundamental importance of Turing's
paper of 1936 ... Von Neumann introduced me to that paper and at his urging I studied it with care. Many
people have acclaimed von Neumann as the "father of the computer" (in a modern sense of the term) but I
am sure that he would never have made that mistake himself. He might well be called the midwife, perhaps,
but he firmly emphasized to me, and to others I am sure, that the fundamental conception is owing to Turing
— in so far as not anticipated by Babbage ... Both Turing and von Neumann, of course, also made
substantial contributions to the "reduction to practice" of these concepts but I would not regard these as
comparable in importance with the introduction and explication of the concept of a computer able to store in
its memory its program of activities and of modifying that program in the course of these activities.[9]

Later, Turing produced a detailed technical report, Proposed Electronic Calculator, describing the Automatic
Computing Engine (ACE).[10] He presented this to the Executive Committee of the British National Physical
Laboratory on 19 February 1946. Although Turing knew from his wartime experience at Bletchley Park that
what he proposed was feasible, the secrecy that was maintained about Colossus for several decades
prevented him from saying so. Various successful implementations of the ACE design were produced.

Both von Neumann's and Turing's papers described stored-program computers, but von Neumann's earlier
paper achieved greater circulation, and the computer architecture it outlined became known as the "von
Neumann architecture". In the 1953 book Faster than Thought (edited by B.V. Bowden), a section in the
chapter on Computers in America reads as follows:[11]

THE MACHINE OF THE INSTITUTE FOR ADVANCED STUDIES, PRINCETON

In 1945, Professor J. von Neumann, who was then working at the Moore School of Engineering in
Philadelphia, where the E.N.I.A.C. had been built, issued on behalf of a group of his co-workers a report on
the logical design of digital computers. The report contained a fairly detailed proposal for the design of the
machine which has since become known as the E.D.V.A.C. (electronic discrete variable automatic
computer). This machine has only recently been completed in America, but the von Neumann report
inspired the construction of the E.D.S.A.C. (electronic delay-storage automatic calculator) in Cambridge
(see page 130).

In 1947, Burks, Goldstine and von Neumann published another report which outlined the design of another
type of machine (a parallel machine this time) which should be exceedingly fast, capable perhaps of 20,000
operations per second. They pointed out that the outstanding problem in constructing such a machine was
in the development of a suitable memory, all the contents of which were instantaneously accessible, and at
first they suggested the use of a special tube—called the Selectron, which had been invented by the
Princeton Laboratories of the R.C.A. These tubes were expensive and difficult to make, so von Neumann
subsequently decided to build a machine based on the Williams memory. This machine, which was
completed in June, 1952 in Princeton has become popularly known as the Maniac. The design of this
machine has inspired that of half a dozen or more machines which are now being built in America, all of
which are known affectionately as "Johniacs".
In the same book, the first two paragraphs of a chapter on ACE read as follows: [12]

AUTOMATIC COMPUTATION AT THE NATIONAL PHYSICAL LABORATORY

One of the most modern digital computers which embodies developments and improvements in the
technique of automatic electronic computing was recently demonstrated at the National Physical
Laboratory, Teddington, where it has been designed and built by a small team of mathematicians and
electronics research engineers on the staff of the Laboratory, assisted by a number of production engineers
from the English Electric Company, Limited. The equipment so far erected at the Laboratory is only the pilot
model of a much larger installation which will be known as the Automatic Computing Engine, but although
comparatively small in bulk and containing only about 800 thermionic valves, as can be judged from Plates
XII, XIII and XIV, it is an extremely rapid and versatile calculating machine.

The basic concepts and abstract principles of computation by a machine were formulated by Dr. A. M.
Turing, F.R.S., in a paper read before the London Mathematical Society in 1936, but work on such
machines in Britain was delayed by the war. In 1945, however, an examination of the problems was made
at the National Physical Laboratory by Mr. J. R. Womersley, then superintendent of the Mathematics
Division of the Laboratory. He was joined by Dr. Turing and a small staff of specialists, and, by 1947, the
preliminary planning was sufficiently advanced to warrant the establishment of the special group already
mentioned. In April, 1948, the latter became the Electronics Section of the Laboratory, under the charge of
Mr. F. M. Colebrook.

Modern functional programming and object-oriented programming are
much less geared towards "pushing vast numbers of words back and forth" than earlier languages
like Fortran were, but internally, that is still what computers spend much of their time doing, even highly
parallel supercomputers.

Early von Neumann-architecture computers

The First Draft described a design that was used by many universities and corporations to construct their
computers.[13] Among these various computers, only ILLIAC and ORDVAC had compatible instruction sets.

 ORDVAC (U-Illinois) at Aberdeen Proving Ground, Maryland (completed Nov 1951 [14])

 IAS machine at Princeton University (Jan 1952)

 MANIAC I at Los Alamos Scientific Laboratory (Mar 1952)

 ILLIAC at the University of Illinois, (Sept 1952)

 AVIDAC at Argonne National Laboratory (1953)

 ORACLE at Oak Ridge National Laboratory (Jun 1953)

 JOHNNIAC at RAND Corporation (Jan 1954)

 BESK in Stockholm (1953)

 BESM-1 in Moscow (1952)

 DASK in Denmark (1955)


 PERM in Munich (1956?)

 SILLIAC in Sydney (1956)

 WEIZAC in Rehovoth (1955)


Early stored-program computers

The date information in the following chronology is difficult to put into proper order. Some dates are for first
running a test program, some dates are the first time the computer was demonstrated or completed, and
some dates are for the first delivery or installation.

 The IBM SSEC, a stored-program electromechanical computer, was publicly demonstrated
on January 27, 1948. Because it was partially electromechanical, it was not fully electronic.

 The Manchester SSEM (the Baby) was the first fully electronic computer to run a stored program. It
ran a factoring program for 52 minutes on June 21, 1948, after running a simple division program and a
program to show that two numbers were relatively prime.

 The ENIAC was modified to run as a primitive read-only stored-program computer (using the
Function Tables for program ROM) and was demonstrated as such on September 16, 1948, running a
program by Adele Goldstine for von Neumann.

 The BINAC ran some test programs in February, March, and April 1949, although it wasn't
completed until September 1949.

 The Manchester Mark 1 developed from the SSEM project. An intermediate version of the Mark 1
was available to run programs in April 1949, but it wasn't completed until October 1949.

 The EDSAC ran its first program on May 6, 1949.

 The EDVAC was delivered in August 1949, but it had problems that kept it from being put into
regular operation until 1951.

 The CSIR Mk I ran its first program in November 1949.

 The SEAC was demonstrated in April 1950.

 The Pilot ACE ran its first program on May 10, 1950 and was demonstrated in December 1950.

 The SWAC was completed in July 1950.

 The Whirlwind was completed in December 1950 and was in actual use in April 1951.

 The first ERA Atlas (later the commercial ERA 1101/UNIVAC 1101) was installed in December
1950.
Non-von Neumann processors

The NEC µPD7281D pixel processor has been described as the first non-von Neumann microprocessor.

Perhaps the most common kind of non-von Neumann structure used in modern computers is content-
addressable memory (CAM).

In some cases, emerging memristor technology may be able to circumvent the von Neumann bottleneck. [15]
See also

 Harvard architecture

 Modified Harvard architecture

 Turing machine

 Random access machine

 Little man computer

 CARDboard Illustrative Aid to Computation

 Von Neumann syndrome

 Interconnect bottleneck

What is the difference between a von Neumann architecture and a Harvard architecture?

Applies to: ARM1020/22E, ARM1026EJ-S, ARM1136, ARM720T, ARM7EJ-S, ARM7TDMI,
ARM7TDMI-S, ARM920/922T, ARM926EJ-S, ARM940T, ARM946E-S, ARM966E-S, ARM9TDMI

A Harvard architecture has separate data and instruction buses, allowing
transfers to be performed simultaneously on both buses. A von
Neumann architecture has only one bus, which is used for both data
transfers and instruction fetches; data transfers and
instruction fetches must therefore be scheduled and cannot be performed at the
same time.

It is possible to have two separate memory systems for a Harvard
architecture. As long as data and instructions can be fed in at the same
time, it doesn't matter whether they come from a cache or from memory.
But there are problems with this. Compilers generally embed data (literal
pools) within the code, and it is often also necessary to be able to write to
the instruction memory space, for example in the case of self-modifying
code or, if an ARM debugger is used, to set software breakpoints in
memory. If there are two completely separate, isolated memory systems,
this is not possible. There must be some kind of bridge between the
memory systems to allow it.

Using a simple, unified memory system together with a Harvard
architecture is highly inefficient. Unless it is possible to feed data into both
buses at the same time, it might be better to use a von Neumann
architecture processor.
Use of caches

At higher clock speeds, caches are useful as the memory speed is
proportionally slower. Harvard architectures tend to be targeted at
higher-performance systems, and so caches are nearly always used in
such systems.

Von Neumann architectures usually have a single unified cache, which
stores both instructions and data. The proportion of each in the cache is
variable, which may be a good thing. It would in principle be possible to
have separate instruction and data caches, storing data and instructions
separately, but this would probably not be very useful, as only one cache
could ever be accessed at a time.

Caches for Harvard architectures are very useful. Such a system would
have separate caches for each bus. Trying to use a shared cache on a
Harvard architecture would be very inefficient, since only one bus can
then be fed at a time. Having two caches means it is possible to feed both
buses simultaneously, which is exactly what a Harvard architecture
requires.

This also allows a very simple unified memory system to be used, with the
same address space for both instructions and data. This gets around the
problem of literal pools and self-modifying code. What it does mean,
however, is that when starting with empty caches, instructions and data
must be fetched from the single memory system at the same time. Two
memory accesses are therefore needed before the core has all the data it
needs, and performance will be no better than a von Neumann
architecture. However, as the caches fill up, it is much more
likely that the instruction or data value has already been cached, so
only one of the two has to be fetched from memory. The other can be
supplied directly from the cache with no additional delay. The best
performance is achieved when both instructions and data are supplied by
the caches, with no need to access external memory at all.

This is the most sensible compromise and the architecture used by ARM's
Harvard processor cores. Two separate memory systems can perform
better, but would be difficult to implement.
