COA Unit 3
• A number system of base, or radix, r is a system that uses distinct symbols for r
digits
• Numbers are represented by a string of digit symbols
• The string of digits 724.5 represents the quantity
7 x 10^2 + 2 x 10^1 + 4 x 10^0 + 5 x 10^-1
• The string of digits 101101 in the binary number system represents the quantity
1 x 2^5 + 0 x 2^4 + 1 x 2^3 + 1 x 2^2 + 0 x 2^1 + 1 x 2^0 = 45
• (101101)₂ = (45)₁₀
• We will also use the octal (radix 8) and hexadecimal (radix 16) number systems
• Each octal digit corresponds to three binary digits
• Each hexadecimal digit corresponds to four binary digits
• Rather than specifying numbers in binary form, we can refer to them in octal or
hexadecimal and reduce the number of digits to 1/3 or 1/4, respectively (see the sketch below)
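A minimal Python sketch of these two ideas: evaluating a binary string by positional weights, and regrouping its bits into octal (groups of 3) or hexadecimal (groups of 4) digits. The function names are illustrative only.

def binary_to_decimal(bits):
    # sum of bit x 2^position, as in 101101 -> 1 x 2^5 + 0 x 2^4 + ... = 45
    return sum(int(b) << (len(bits) - 1 - i) for i, b in enumerate(bits))

def regroup(bits, group):
    # pad on the left so the length is a multiple of `group` (3 for octal, 4 for hex)
    padded = bits.zfill(-(-len(bits) // group) * group)
    return [padded[i:i + group] for i in range(0, len(padded), group)]

print(binary_to_decimal("101101"))   # 45
print(regroup("101101", 3))          # ['101', '101']  -> octal 55
print(regroup("101101", 4))          # ['0010', '1101'] -> hex 2D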
• A binary code is a group of n bits that assume up to 2^n distinct combinations
• A four-bit code is necessary to represent the ten decimal digits – 6 of the 16 combinations are unused
• The most popular decimal code is called binary-coded decimal (BCD)
• BCD is different from converting a decimal number to binary
• For example 99, when converted to binary, is 1100011
• 99 when represented in BCD is 1001 1001
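A minimal Python sketch of the distinction: BCD encodes each decimal digit in its own 4-bit group, which is not the same as converting the whole number to binary.

def to_bcd(n):
    return " ".join(format(int(d), "04b") for d in str(n))

print(format(99, "b"))   # 1100011   (99 as a pure binary number)
print(to_bcd(99))        # 1001 1001 (99 in BCD, one 4-bit group per digit)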
• The standard alphanumeric binary code is ASCII
• This uses seven bits to code 128 characters
• Binary codes are required since registers can hold binary information only
Section 3.2 – Complements
• Complements are used in digital computers for simplifying subtraction and logical
manipulation
• Two types of complements for each base r system: r’s complement and (r – 1)’s
complement
• Given a number N in base r having n digits, the (r – 1)'s complement of N is
defined as (r^n – 1) – N
• The 9’s complement of 453299 is 999999 – 453299 = 546700
• For binary, the 1's complement of N is (2^n – 1) – N
• The 1’s complement of 1011001 is 1111111 – 1011001 = 0100110
• The 1's complement of a binary number is obtained simply by toggling (complementing) each bit
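A minimal Python sketch of these definitions; the function names are illustrative assumptions, not part of the notes.

def diminished_radix_complement(N, n, r):
    # (r-1)'s complement of an n-digit number N in base r
    return (r**n - 1) - N

def radix_complement(N, n, r):
    # r's complement = (r-1)'s complement + 1, with 0 mapping to 0
    return r**n - N if N != 0 else 0

print(diminished_radix_complement(453299, 6, 10))             # 546700 (9's complement)
print(format(diminished_radix_complement(0b1011001, 7, 2), "07b"))  # 0100110 (1's complement)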
Example: subtract 13250 from 72532 using 10's complements.
M = 72532
10's comp. of N = +86750
Sum = 159282
Discard end carry 10^5 = -100000
Answer = 59282

Example: subtract 72532 from 13250 using 10's complements.
M = 13250
10's comp. of N = +27468
Sum = 40718
No end carry
Answer = -59282 (10's comp. of 40718)
Example: X – Y, where X = 1010100 and Y = 1000011, using 2's complements.
X = 1010100
2's comp. of Y = +0111101
Sum = 10010001
Discard end carry 2^7 = -10000000
Answer X – Y = 0010001

Example: Y – X, using 2's complements.
Y = 1000011
2's comp. of X = +0101100
Sum = 1101111
No end carry
Answer = -0010001 (2’s comp. of 1101111)
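A minimal Python sketch of the procedure just shown: add the 2's complement of the subtrahend; if an end carry is produced it is discarded, and if there is no end carry the result is negative and held in complemented form. The function name and word width are illustrative.

def subtract_via_twos_complement(x, y, n=7):
    total = x + ((1 << n) - y)          # X + 2's complement of Y
    if total >= (1 << n):               # end carry produced
        return total - (1 << n), "+"    # discard the end carry
    return (1 << n) - total, "-"        # re-complement: result is negative

mag, sign = subtract_via_twos_complement(0b1010100, 0b1000011)
print(sign, format(mag, "07b"))   # + 0010001
mag, sign = subtract_via_twos_complement(0b1000011, 0b1010100)
print(sign, format(mag, "07b"))   # - 0010001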
• When an integer is positive, the msb, or sign bit, is 0 and the remaining bits
represent the magnitude
• When an integer is negative, the msb, or sign bit, is 1, but the rest of the number
can be represented in one of three ways
o Signed-magnitude representation
o Signed-1’s complement representation
o Signed-2’s complement representation
Examples of addition with signed-2's complement numbers (8 bits, two examples per row):

+6   00000110        -6   11111010
+13  00001101        +13  00001101
+19  00010011        +7   00000111

+6   00000110        -6   11111010
-13  11110011        -13  11110011
-7   11111001        -19  11101101
• An overflow occurs when two numbers of n digits each are added and the sum
occupies n + 1 digits
• Overflows are problems since the width of a register is finite
• Therefore, a flag is set if this occurs and can be checked by the user
• Detection of an overflow depends on if the numbers are signed or unsigned
• For unsigned numbers, an overflow is detected from the end carry out of the msb
• For addition of signed numbers, an overflow cannot occur if one number is positive and the
other is negative – an overflow is possible only when both have the same sign
• An overflow can be detected if the carry into the sign bit position and the carry
out of the sign bit position are not equal
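A minimal Python sketch of this rule for 8-bit signed (2's complement) addition: the carry into the sign bit is compared with the carry out of it. The function name is an illustrative assumption.

def add_signed_8bit(a, b):
    a &= 0xFF
    b &= 0xFF
    s = a + b
    carry_out = (s >> 8) & 1                         # carry out of the sign bit
    carry_in = ((a & 0x7F) + (b & 0x7F)) >> 7 & 1    # carry into the sign bit
    overflow = carry_in != carry_out
    return s & 0xFF, overflow

print(add_signed_8bit(70, 80))   # (150, True)  -> +70 + +80 overflows 8 bits
print(add_signed_8bit(6, -13))   # (249, False) -> 249 is -7 in 2's complement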
Example: decimal subtraction performed by adding the 10's complement of the subtrahend in BCD
(9760 is the 10's complement of 0240); the end carry of the sum is discarded:
  0375  (0000 0011 0111 0101)BCD
+ 9760  (1001 0111 0110 0000)BCD
  0135  (0000 0001 0011 0101)BCD
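A minimal Python sketch of this decimal example (the digit count and function name are illustrative): the subtrahend is replaced by its 10's complement, the numbers are added, and the end carry is discarded, as the BCD hardware does.

def decimal_subtract(m, n, digits=4):
    tens_complement = 10**digits - n
    total = m + tens_complement
    return total % 10**digits            # discarding the end carry

print(decimal_subtract(375, 240))         # 135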
• Binary codes for decimal digits require a minimum of four bits
• Other codes besides BCD exist to represent decimal digits
• The 2421 code and the excess-3 code are both self-complementing
• The 9’s complement of each digit is obtained by complementing each bit in the
code
• The 2421 code is a weighted code
• The bits are multiplied by indicated weights and the sum gives the decimal digit
• The excess-3 code for a digit is obtained by adding 3 (0011) to the corresponding BCD code
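A minimal Python sketch checking the self-complementing property for excess-3: inverting the four bits of a digit's code gives the code of its 9's complement.

def excess3(d):
    return d + 3                    # value of the 4-bit code for digit d

for d in range(10):
    inverted = (~excess3(d)) & 0xF  # complement each of the four bits
    assert inverted == excess3(9 - d)
print("excess-3 is self-complementing")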
• A parity bit is an extra bit included with a binary message to make the total
number of 1’s either odd or even
• The P(odd) bit is chosen to make the sum of 1’s in all four bits odd
• The even-parity scheme has the disadvantage of having a bit combination of all
0’s
• Procedure during transmission:
o At the sending end, the message is applied to a parity generator
o The message, including the parity bit, is transmitted
o At the receiving end, all the incoming bits are applied to a parity checker
o Any odd number of errors is detected
• Parity generators and checkers are constructed with XOR gates (odd function)
• An odd function generates 1 iff an odd number of input variables is 1
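A minimal Python sketch of an odd-parity generator and checker built from XOR (the odd function); the function names and the 3-bit message are illustrative assumptions.

from functools import reduce
from operator import xor

def odd_parity_bit(bits):
    return reduce(xor, bits) ^ 1            # 1 iff the message has an even number of 1's

def odd_parity_error(received_bits):
    return reduce(xor, received_bits) == 0  # parity no longer odd -> error detected

message = [1, 0, 1]                 # 3-bit message
p = odd_parity_bit(message)         # p = 1, giving three 1's in total
print(odd_parity_error(message + [p]))   # False: no error
print(odd_parity_error([1, 1, 1, p]))    # True: a single-bit error is caught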
UNIT-IV
COMPUTER ARITHMETIC
Introduction:
In digital computers, data is manipulated with arithmetic instructions to produce the results needed to
solve computational problems. Addition, subtraction, multiplication and division are the four basic
arithmetic operations; other operations can be derived from these four.
To execute arithmetic operations there is a separate section of the central processing unit called the
arithmetic processing unit. Arithmetic instructions generally operate on binary or decimal data.
Fixed-point numbers are used to represent integers or fractions; they may be signed or unsigned.
Fixed-point addition is the simplest arithmetic operation.
A sequence of well-defined steps used to solve a problem is called an algorithm; algorithms are given
here for the various arithmetic operations.
As already stated, from the four basic operations (addition, subtraction, multiplication and
division) it is possible to derive other arithmetic operations and to solve scientific problems by
means of numerical analysis methods.
A processor has an arithmetic processor (as a sub-part of it) that executes arithmetic
operations. The data type is assumed to reside in processor registers during the execution
of an arithmetic instruction. Negative numbers may be in signed-magnitude or signed-
complement representation. There are three ways of representing negative fixed-point
binary numbers: signed-magnitude, signed 1's complement, or signed 2's complement.
Most computers use the signed-magnitude representation for the mantissa.
We designate the magnitude of the two numbers by A and B. When the signed numbers
are added or subtracted, we find that there are eight different conditions to consider,
depending on the sign of the numbers and the operation performed. These conditions are
listed in the first column of Table 4.1. The other columns in the table show the actual
operation to be performed with the magnitude of the numbers. The last column is needed
to prevent a negative zero. In other words, when two equal numbers are subtracted, the
result should be +0, not -0.
The algorithms for addition and subtraction are derived from the table and can be stated
as follows (the words in parentheses should be used for the subtraction algorithm):
Addition and Subtraction of Signed-Magnitude Numbers
[Block diagram: the B register and a complementer feed a parallel adder; the sum is loaded into the A register (AC), the output carry into E, and an overflow flip-flop (V / AVF) records overflow.]
Algorithm (two-column flowchart):
Subtract: minuend in AC, subtrahend in B; AC ← AC + B' + 1; V ← overflow; END.
Add: augend in AC, addend in B; AC ← AC + B; V ← overflow; END.
Algorithm:
The flowchart is shown in Figure 7.1. The two signs As and Bs are compared by an
exclusive-OR gate.
For an add operation, identical signs dictate that the magnitudes be added. For a
subtract operation, different signs dictate that the magnitudes be added.
The magnitudes are added with the microoperation EA ← A + B, where EA is a register
that combines E and A. The carry in E after the addition constitutes an overflow if it is
equal to 1. The value of E is transferred into the add-overflow flip-flop AVF.
The two magnitudes are subtracted if the signs are different for an add operation or
identical for a subtract operation. The magnitudes are subtracted by adding A to the
2's complemented B. No overflow can occur if the numbers are subtracted so AVF is
cleared to 0.
A 1 in E indicates that A >= B and the number in A is the correct result. If this number is
zero, the sign As must be made positive to avoid a negative zero.
A 0 in E indicates that A < B. In this case it is necessary to take the 2's complement of
the value in A. This could be done with one microoperation A ← A' + 1.
However, we assume that the A register has circuits for the complement and increment
microoperations, so the 2's complement is obtained from these two microoperations.
In the other paths of the flowchart, the sign of the result is the same as the sign of A, so no
change in As is required. However, when A < B, the sign of the result is the
complement of the original sign of A. It is then necessary to complement As to obtain
the correct sign.
The final result is found in register A and its sign in As. The value in AVF provides an
overflow indication. The final value of E is immaterial.
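A minimal Python sketch of the flowchart just described, assuming n-bit magnitudes in A and B with separate sign bits As and Bs (names follow the text); it is a software illustration, not the register-transfer hardware itself.

def signed_mag_addsub(As, A, Bs, B, subtract, n=8):
    if subtract:
        Bs ^= 1                                  # subtraction = addition of -B
    if As == Bs:                                 # identical signs: add magnitudes
        EA = A + B
        AVF = (EA >> n) & 1                      # end carry E becomes the overflow flag
        A = EA & ((1 << n) - 1)
    else:                                        # different signs: A + 2's complement of B
        EA = A + ((1 << n) - B)
        AVF = 0
        if (EA >> n) & 1:                        # E = 1: A >= B, result already correct
            A = EA & ((1 << n) - 1)
        else:                                    # E = 0: A < B, re-complement and flip sign
            A = (1 << n) - (EA & ((1 << n) - 1))
            As ^= 1
    if A == 0:
        As = 0                                   # avoid a negative zero
    return As, A, AVF

print(signed_mag_addsub(0, 6, 1, 13, subtract=False))   # (1, 7, 0) -> +6 + (-13) = -7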
Figure 7.2 shows a block diagram of the hardware for implementing the addition and
subtraction operations.
The add-overflow flip-flop AVF holds the overflow bit when A and B are added.
The A register provides other microoperations that may be needed when we specify
the sequence of steps in the algorithm.
Multiplication Algorithm:
In the beginning, the multiplicand is in B and the multiplier in Q, with their signs in Bs and Qs
respectively. The signs are compared to determine the sign of the product, since a double-length
product will be stored in registers A and Q. Registers A and E are cleared and the sequence
counter SC is set to the number of bits of the multiplier. Since an operand must be stored with its
sign, one bit of the word is occupied by the sign and the magnitude consists of n-1 bits.
Now the low-order bit of the multiplier in Qn is tested. If it is 1, the multiplicand in B is added to
the present partial product in A; if it is 0, nothing is done. Register EAQ is then shifted once to the
right to form the new partial product. The sequence counter is decremented by 1 and its new value
checked. If it is not equal to zero, the process is repeated and a new partial product is formed.
When SC = 0 the process stops; the sketch below simulates this loop.
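A minimal Python sketch of the shift-and-add loop for unsigned magnitudes; registers A (partial product), Q (multiplier) and E (carry) are simulated as integers, the multiplicand stays in B, and the signs are assumed to be handled separately in As/Qs as the text describes. The function name is illustrative.

def multiply_unsigned(B, Q, n):
    A, E = 0, 0
    SC = n                               # sequence counter = number of multiplier bits
    while SC > 0:
        if Q & 1:                        # Qn = 1: add multiplicand to partial product
            A += B
            E = (A >> n) & 1
            A &= (1 << n) - 1
        # shift EAQ right one place
        Q = ((A & 1) << (n - 1)) | (Q >> 1)
        A = (E << (n - 1)) | (A >> 1)
        E = 0
        SC -= 1
    return (A << n) | Q                  # double-length product held in A and Q

print(multiply_unsigned(23, 19, 5))      # 437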
Booth’s algorithm :
Booth algorithm gives a procedure for multiplying binary integers in signed- 2’s
complement representation.
It operates on the fact that strings of 0's in the multiplier require no addition but just
shifting, and a string of 1's in the multiplier from bit weight 2^k to weight 2^m can be
treated as 2^(k+1) – 2^m.
For example, the binary number 001110 (+14) has a string of 1's from 2^3 to 2^1 (k = 3,
m = 1). The number can be represented as 2^(k+1) – 2^m = 2^4 – 2^1 = 16 – 2 = 14. Therefore,
the multiplication M x 14, where M is the multiplicand and 14 the multiplier, can be
done as M x 2^4 – M x 2^1.
Thus the product can be obtained by shifting the binary multiplicand M four times to
the left and subtracting M shifted left once.
Prior to the shifting, the multiplicand may be added to the partial product, subtracted
from the partial, or left unchanged according to the following rules:
1. The multiplicand is subtracted from the partial product upon encountering the
first least significant 1 in a string of 1’s in the multiplier.
2. The multiplicand is added to the partial product upon encountering the first 0
in a string of 0’s in the multiplier.
3. The partial product does not change when multiplier bit is identical to the
previous multiplier bit.
This is because a negative multiplier ends with a string of 1’s and the last operation
will be a subtraction of the appropriate weight.
If the two bits (the present multiplier bit and the bit to its right) are equal to 10, it means that the
first 1 in a string of 1's has been encountered. This requires a subtraction of the multiplicand from
the partial product in AC.
If the two bits are equal to 01, it means that the first 0 in a string of 0's has been
encountered. This requires the addition of the multiplicand to the partial product in
AC.
When the two bits are equal, the partial product does not change.
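A minimal Python sketch of these rules: the pair (current multiplier bit, previous bit) selects subtract (10), add (01) or no operation (00/11), followed by an arithmetic shift right of the partial product and multiplier. The function name is illustrative; it is a software model of the register behaviour.

def booth_multiply(multiplicand, multiplier, n):
    mask = (1 << n) - 1
    AC = 0
    QR = multiplier & mask
    Qn1 = 0                                      # extra bit to the right of QR
    for _ in range(n):
        pair = ((QR & 1) << 1) | Qn1
        if pair == 0b10:                         # first 1 of a string of 1's
            AC = (AC - multiplicand) & mask
        elif pair == 0b01:                       # first 0 after a string of 1's
            AC = (AC + multiplicand) & mask
        # arithmetic shift right of AC, QR, Qn1
        Qn1 = QR & 1
        QR = ((AC & 1) << (n - 1)) | (QR >> 1)
        AC = (AC >> 1) | (AC & (1 << (n - 1)))   # keep the sign bit
    product = (AC << n) | QR
    if product & (1 << (2 * n - 1)):             # interpret the 2n-bit result as signed
        product -= 1 << (2 * n)
    return product

print(booth_multiply(7, -3, 5))                  # -21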
Division Algorithms
Division of two fixed-point binary numbers in signed magnitude representation is performed with
paper and pencil by a process of successive compare, shift and subtract operations. Binary
division is much simpler than decimal division because here the quotient digits are either 0 or 1
and there is no need to estimate how many times the dividend or partial remainder fits into the
divisor. The division process is described in Figure
The divisor is compared with the five most significant bits of the dividend. Since the
5-bit number is smaller than B, we try again with the six most significant bits. The 6-bit
number is greater than B, so we place a 1 for the quotient bit in the sixth position
above the dividend. We then shift the divisor once to the right and subtract it from the
dividend. The difference is called a partial remainder because the division could
have stopped here to obtain a quotient of 1 and a remainder equal to the partial
remainder. The process continues by comparing a partial remainder with the divisor. If
the partial remainder is greater than or equal to the divisor, the quotient bit is equal to
1. The divisor is then shifted right and subtracted from the partial remainder. If the
partial remainder is smaller than the divisor, the quotient bit is 0 and no subtraction is
needed. The divisor is shifted once to the right in any case. Obviously the result gives
both a quotient and a remainder.
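A minimal Python sketch of the compare/shift/subtract idea as restoring division of unsigned fixed-point numbers, producing a quotient and a remainder. The function name and word sizes are illustrative assumptions.

def restoring_divide(dividend, divisor, n):
    # n = number of quotient bits; the dividend may be up to 2n bits wide
    A = dividend >> n                 # high half acts as the partial remainder
    Q = dividend & ((1 << n) - 1)
    for _ in range(n):
        # shift AQ left one position
        A = (A << 1) | (Q >> (n - 1))
        Q = (Q << 1) & ((1 << n) - 1)
        if A >= divisor:              # partial remainder fits: quotient bit = 1
            A -= divisor
            Q |= 1                    # otherwise the quotient bit stays 0
    return Q, A                       # quotient, remainder

print(restoring_divide(448, 17, 8))   # (26, 6) since 448 = 17*26 + 6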
Algorithm:
Example of Binary Division with Digital Hardware
In many high-level programming languages we have a facility for specifying floating-point
numbers. The most common way is by a real declaration statement. High level programming
languages must have a provision for handling floating-point arithmetic operations. The operations
are generally built in the internal hardware. If no hardware is available, the compiler must be
designed with a package of floating-point software subroutines. Although the hardware method is
more expensive, it is much more efficient than the software method. Therefore, floating-point
hardware is included in most computers and is omitted only in very small ones.
Basic Considerations :
There are two parts of a floating-point number in a computer - a mantissa m and an exponent e.
The two parts represent a number generated from multiplying m times a radix r raised to the value
of e. Thus
m x r^e
The mantissa may be a fraction or an integer. The position of the radix point and the value of the
radix r are not included in the registers. For example, assume a fraction representation and a radix
10. The decimal number 537.25 is represented in a register with m = 53725 and e = 3 and is
interpreted to represent the floating-point number
.53725 x 10^3
A floating-point number is said to be normalized if the most significant digit of the mantissa is
nonzero. So the mantissa contains the maximum possible number of significant digits. We cannot
normalize a zero because it does not have a nonzero digit. It is represented in floating-point by all
0’s in the mantissa and exponent.
Floating-point representation increases the range of numbers for a given register. Consider a
computer with 48-bit words. Since one bit must be reserved for the sign, the range of fixed-point
integer numbers will be ±(2^47 – 1), which is approximately ±10^14. The 48 bits can be used to
represent a floating-point number with 36 bits for the mantissa and 12 bits for the exponent.
Assuming fraction representation for the mantissa and taking the two sign bits into consideration,
the range of numbers that can be represented is
±(1 – 2^-35) x 2^2047
This number is derived from a fraction that contains 35 1's and an exponent of 11 bits (excluding its
sign), since 2^11 – 1 = 2047. The largest number that can be accommodated is approximately
10^615. The mantissa can accommodate 35 bits (excluding the sign) and, if considered as an
integer, it can store a number as large as 2^35 – 1. This is approximately equal to 10^10, which is
equivalent to a decimal number of 10 digits.
Computers with shorter word lengths use two or more words to represent a floating-point number.
An 8-bit microcomputer uses four words to represent one floating-point number. One 8-bit word
is reserved for the exponent and the 24 bits of the other three words are used for the
mantissa.
Arithmetic operations with floating-point numbers are more complicated than with fixed-point
numbers. Their execution also takes longer time and requires more complex hardware. Adding or
subtracting two numbers requires first an alignment of the radix point since the exponent parts
must be made equal before adding or subtracting the mantissas. We do this alignment by shifting
one mantissa while its exponent is adjusted until it becomes equal to the other exponent. Consider
the sum of the following floating-point numbers:
.5372400 x 10^2
+ .1580000 x 10^-1
The exponent of the second number is raised to 2 by shifting its mantissa three positions to the
right, giving .0001580 x 10^2, so the sum is .5373980 x 10^2.
Floating-point multiplication and division need not do an alignment of the mantissas. Multiplying
the two mantissas and adding the exponents can form the product. Dividing the mantissas and
subtracting the exponents perform division.
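A minimal Python sketch of both points using (mantissa, exponent) pairs with radix 10 and 7-digit fraction mantissas, as in the example above: addition aligns the smaller exponent first, while multiplication just multiplies mantissas and adds exponents. The function names are illustrative assumptions.

RADIX, DIGITS = 10, 7

def fp_add(m1, e1, m2, e2):
    if e1 < e2:                                   # align: shift the smaller number right
        m1, e1, m2, e2 = m2, e2, m1, e1
    m2 = m2 // RADIX**(e1 - e2)
    return m1 + m2, e1                            # (unnormalized) sum

def fp_mul(m1, e1, m2, e2):
    return (m1 * m2) // RADIX**DIGITS, e1 + e2    # keep DIGITS fraction digits

# .5372400 x 10^2 + .1580000 x 10^-1 = .5373980 x 10^2
print(fp_add(5372400, 2, 1580000, -1))            # (5373980, 2)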
The operations done with the mantissas are the same as in fixed-point numbers, so the two can
share the same registers and circuits. The operations performed with the exponents are compare
and increment (for aligning the mantissas), add and subtract (for multiplication and division),
and decrement (to normalize the result). We can represent the exponent in any one of
the three representations - signed-magnitude, signed 2’s complement or signed 1’s complement.
Biased exponents have the advantage that they contain only positive numbers. Now it becomes
simpler to compare their relative magnitude without bothering about their signs. Another
advantage is that the smallest possible biased exponent contains all zeros. The floating-point
representation of zero is then a zero mantissa and the smallest possible exponent.
Register Configuration
The register configuration for floating-point operations is shown in figure 4.13. As a rule, the
same registers and adder used for fixed-point arithmetic are used for processing the mantissas.
The difference lies in the way the exponents are handled.
There are three registers: BR, AC, and QR. Each register is subdivided into two parts. The mantissa part has the
same uppercase letter symbols as in fixed-point representation. The exponent part uses the
corresponding lowercase letter symbol.
F = m x r^e, where m is the mantissa, r the radix, and e the exponent.
[Figure 4.13: register configuration for floating-point operations – BR holds Bs, B and b; AC holds As, A1, A and a; QR holds Qs, Q and q; a parallel adder operates on the mantissas and a parallel adder and comparator on the exponents.]
It is assumed that each floating-point number has a mantissa in signed-magnitude representation and
a biased exponent. Thus the AC has a mantissa whose sign is in As and a magnitude that is in A.
The diagram shows the most significant bit of A, labeled A1. The bit in this position must be a
1 for the number to be normalized. Note that the symbol AC represents the entire register, that is, the
concatenation of As, A and a.
In a similar way, register BR is subdivided into Bs, B, and b, and QR into Qs, Q, and q. A
parallel adder adds the two mantissas and loads the sum into A and the carry into E. A separate
parallel adder can be used for the exponents. The exponents do not have a distinct sign bit because
they are biased but are represented as a biased positive quantity. It is assumed that the floating-
point numbers are so large that the chance of an exponent overflow is very remote, and so the
exponent overflow will be neglected. The exponents are also connected to a magnitude
comparator that provides three binary outputs to indicate their relative magnitude.
The numbers in the mantissa will be taken as fractions, so the binary point is assumed to reside
to the left of the magnitude part. Integer representation for floating point causes certain scaling
problems during multiplication and division. To avoid these problems, we adopt a fraction
representation.
The numbers in the registers should initially be normalized. After each arithmetic operation, the
result will be normalized. Thus all floating-point operands are always normalized.
Addition and Subtraction of Floating Point Numbers
During addition or subtraction, the two floating-point operands are kept in AC and BR. The sum
or difference is formed in the AC. The algorithm can be divided into four consecutive parts:
1. Check for zeros.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.
If the magnitudes were subtracted, the result may be zero or may have an underflow. If the
mantissa is equal to zero, the entire floating-point number in the AC is cleared to zero. Otherwise, the
mantissa must have at least one bit that is equal to 1. The mantissa has an underflow if the most
significant bit in position A1 is 0. In that case, the mantissa is shifted left and the exponent decremented.
The bit in A1 is checked again and the process is repeated until A1 = 1. When A1 = 1, the mantissa is
normalized and the operation is completed.
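A minimal Python sketch of this normalization step (word width and function name are illustrative): while the most significant mantissa bit (A1) is 0, shift the mantissa left and decrement the exponent; a zero mantissa clears the whole floating-point number.

def normalize(mantissa, exponent, n=8):
    if mantissa == 0:
        return 0, 0                          # AC cleared to zero
    while not (mantissa & (1 << (n - 1))):   # A1 = 0 -> mantissa underflow
        mantissa = (mantissa << 1) & ((1 << n) - 1)
        exponent -= 1
    return mantissa, exponent

print(normalize(0b00010110, 5))              # (176, 2): shifted left 3 times, exponent 5 - 3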
Algorithm for Floating Point Addition and Subtraction
Multiplication:
[Figure: flowchart for floating-point multiplication – the surviving microoperation fragments are A ← A + B, shr A, a ← a + 1, a ← a + b' + 1 and a ← a + bias.]
MEMORY ORGANIZATION
Contents: Memory Hierarchy, Main Memory, Auxiliary Memory, Associative Memory
Memory Hierarchy :
[Figure: the memory hierarchy – CPU registers, cache memory, main memory, magnetic disk, magnetic tape.]
Main Memory
Primary memory holds only those data and instructions on which computer is currently working.
It has limited capacity and data is lost when power is switched off.
It is generally made up of semiconductor devices.
These memories are not as fast as registers.
The data and instruction required to be processed reside in main memory.
It is divided into two subcategories RAM and ROM.
Memory address map of RAM and ROM
The designer of a computer system must calculate the amount of memory required for the
particular application and assign it to either RAM or ROM.
The interconnection between memory and processor is then established from knowledge of the
size of memory needed and the type of RAM and ROM chips available.
The addressing of memory can be established by means of a table that specifies the memory
address assigned to each chip.
The table, called a memory address map, is a pictorial representation of assigned address space
for each chip in the system, shown in table 9.1.
To demonstrate with a particular example, assume that a computer system needs 512 bytes of
RAM and 512 bytes of ROM.
The RAM and ROM chips to be used are specified in figure 9.1 and figure 9.2.
Memory address map of RAM and ROM
The hexadecimal address column assigns a range of hexadecimal equivalent addresses for each
chip.
Although there are 16 lines in the address bus, the table shows only 10 lines because the other 6
are not used in this example and are assumed to be zero.
The small x's under the address bus lines designate those lines that must be connected to the
address inputs in each chip.
The RAM chips have 128 bytes and need seven address lines. The ROM chip has 512 bytes and
needs 9 address lines.
The x's are always assigned to the low-order bus lines: lines 1 through 7 for the RAM and lines 1
through 9 for the ROM.
It is now necessary to distinguish between four RAM chips by assigning to each a different
address. For this particular example we choose bus lines 8 and 9 to represent four distinct binary
combinations.
The table clearly shows that the nine low-order bus lines constitute a memory space for RAM
equal to 2^9 = 512 bytes.
The distinction between a RAM and ROM address is done with another bus line. Here we choose
line 10 for this purpose.
When line 10 is 0, the CPU selects a RAM chip; when this line is equal to 1, it selects the ROM.
- RAM and ROM chips are connected to a CPU through the data and address buses
- The low-order lines in the address bus select the byte within the chips and other lines in the address bus select a
particular chip through its chip select inputs.
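A minimal Python sketch of the address map described above: bit 10 selects RAM vs ROM, bits 8-9 select one of the four 128-byte RAM chips, and the low-order bits address a byte inside the selected chip. The function name is illustrative.

def decode(address):
    if (address >> 9) & 1:                    # line 10 (bit 9 counting from 0) = 1 -> ROM
        return "ROM", address & 0x1FF         # lines 1-9 address 512 bytes
    chip = (address >> 7) & 0b11              # lines 8-9 pick RAM 1..4
    return f"RAM {chip + 1}", address & 0x7F  # lines 1-7 address 128 bytes

print(decode(0x000))   # ('RAM 1', 0)
print(decode(0x1F5))   # ('RAM 4', 117)
print(decode(0x200))   # ('ROM', 0)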
[Figure: memory connection to the CPU – a decoder driven by address lines 8 and 9 selects one of four 128 x 8 RAM chips through their chip-select inputs (CS1, CS2); address line 10 distinguishes the 512 x 8 ROM from the RAMs; lines AD1-AD7 address a byte within a RAM chip, lines AD1-AD9 within the ROM, and RD/WR control reading and writing over the data bus.]
Auxiliary Memory :
Magnetic Tape: Magnetic tapes are used for large computers like mainframe computers where large volume
of data is stored for a longer time. In PCs, tapes can also be used in the form of cassettes. Storing data
on tape is inexpensive. Tapes consist of magnetic materials that store data permanently. The tape can be 12.5
mm to 25 mm wide plastic film and 500 meter to 1200 meter long, coated with magnetic
material. The deck is connected to the central processor and information is fed into or read from the tape
through the processor. It is similar to a cassette tape recorder.
Magnetic tape is an information storage medium consisting of a magnetisable coating on a thin plastic strip. Nearly
all recording tape is of this type, whether used for video with a video cassette recorder, audio storage (reel-to-reel
tape, compact audio cassette, digital audio tape (DAT), digital linear tape (DLT) and other formats including 8-track
cartridges) or general purpose digital data storage using a computer (specialized tape formats, as well as the above-
mentioned compact audio cassette, used with home computers of the 1980s, and DAT, used for backup in workstation
installations of the 1990s).
Magneto-optical and optical tape storage products have been developed using many of the same concepts as
magnetic storage, but have achieved little commercial success.
Magnetic Disk: You might have seen the gramophone record, which is circular like a disk and coated with
magnetic material. Magnetic disks used in computers are made on the same principle. The disk rotates at very high
speed inside the computer drive. Data is stored on both surfaces of the disk. Magnetic disks are the most
popular direct-access storage devices. Each disk consists of a number of invisible concentric circles called
tracks. Information is recorded on tracks of a disk surface in the form of tiny magnetic spots. The presence of
a magnetic spot represents a 1 bit and its absence represents a 0 bit. The information stored in a disk can be
read many times without affecting the stored data. So the reading operation is non-destructive. But if you
want to write a new data, then the existing data is erased from the disk and new data is recorded. For
Example-Floppy Disk.
The primary computer storage device. Like tape, it is magnetically recorded and can be re-recorded over and over.
Disks are rotating platters with a mechanical arm that moves a read/write head between the outer and inner edges of
the platter's surface. It can take as long as one second to find a location on a floppy disk to as little as a couple of
milliseconds on a fast hard disk. See hard disk for more details.
The disk surface is divided into concentric tracks (circles within circles). The thinner the tracks, the more storage.
The data bits are recorded as tiny magnetic spots on the tracks. The smaller the spot, the more bits per inch and the
greater the storage.
Sectors
Tracks are further divided into sectors, which hold a block of data that is read or written at one time; for example,
READ SECTOR 782, WRITE SECTOR 5448. In order to update the disk, one or more sectors are read into the
computer, changed and written back to disk. The operating system figures out how to fit data into these fixed spaces.
Modern disks have more sectors in the outer tracks than the inner ones because the outer radius of the platter is
greater than the inner radius.
Optical Disk: With every new application and software there is greater demand for memory capacity. It is the
necessity to store large volume of data that has led to the development of optical disk storage medium. Optical disks
can be divided into the following categories:
Associative Memory :
The time required to find an item stored in memory can be reduced considerably if stored data can
be identified for access by the content of the data itself rather than by an address.
This type of memory is accessed simultaneously and in parallel on the basis of data content rather
than by specific address or location.
It consists of a memory array and logic for m words with n bits per word.
The argument register A and key register K each have n bits, one for each bit of a word.
The match register M has m bits, one for each memory word.
Each word in memory is compared in parallel with the content of the argument register.
The words that match the bits of the argument register set a corresponding bit in the match
register.
After the matching process, those bits in the match register that have been set indicate the fact that
their corresponding words have been matched.
Reading is accomplished by a sequential access to memory for those words whose corresponding
bits in the match register have been set.
Hardware Organization
The key register provides a mask for choosing a particular field or key in the argument word.
The entire argument is compared with each memory word if the key register contains all 1's.
Otherwise, only those bits in the argument that have 1's in their corresponding position of the key
register are compared.
Thus the key provides a mask or identifying piece of information which specifies how the
reference to memory is made.
To illustrate with a numerical example, suppose that the argument register A and the key register
K have the bit configuration shown below.
Only the three leftmost bits of A are compared with memory words because K has 1's in these
positions.
A 101 111100
K 111 000000
Word 2 matches the unmasked argument field because the three leftmost bits of the argument
and the word are equal.
The relation between the memory array and external registers in an associative memory is shown
in figure 9.4.
The cells in the array are marked by the letter C with two subscripts.
The first subscript gives the word number and the second specifies the bit position in the word.
A bit Aj in the argument register is compared with all the bits in column j of the array provided
that Kj =1.
If a match occurs between all the unmasked bits of the argument and the bits in word i, the
corresponding bit Mi in the match register is set to 1.
If one or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
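A minimal Python sketch of the masked-match operation: every word is compared with the argument A only in the bit positions where the key register K holds a 1, and matching words set their bit in the match register M. The word values are illustrative (only A and K come from the example above).

def associative_match(A, K, words):
    M = []
    for word in words:
        M.append(1 if (word ^ A) & K == 0 else 0)   # no masked bit differs
    return M

A = 0b101111100
K = 0b111000000
words = [0b100111100, 0b101000001]       # word 1, word 2 (illustrative contents)
print(associative_match(A, K, words))    # [0, 1] -> only word 2 matches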
Cache Memory :
Cache is a fast, small-capacity memory that should hold the information most likely to be
accessed.
The basic operation of the cache is as follows: when the CPU needs to access memory, the cache is
examined first.
If the word is found in the cache, it is read from the fast memory. If the word addressed by the
CPU is not found in the cache, the main memory is accessed to read the word.
The transformation of data from main memory to cache memory is referred to as a mapping
process.
Associative mapping
Consider the main memory can store 32K words of 12 bits each.
The cache is capable of storing 512 of these words at any given time.
For every word stored in cache, there is a duplicate copy in main memory.
The CPU communicates with both memories.
It first sends a 15-bit address to the cache. If there is a hit, the CPU accepts the 12-bit data from the cache;
if there is a miss, the CPU reads the word from main memory and the word is then transferred to
the cache.
The associative memory stores both the address and content (data) of the memory word.
This permits any location in cache to store any word from main memory.
The figure 9.5 shows three words presently stored in the cache. The address value of 15 bits is
shown as a five-digit octal number and its corresponding 12-bit word is shown as a four-digit octal
number.
A CPU address of 15 bits is placed in the argument register and the associative memory is
searched for a matching address.
If the address is found the corresponding 12-bit data is read and sent to CPU.
If no match occurs, the main memory is accessed for the word.
The address-data pair is then transferred to the associative cache memory.
If the cache is full, an address-data pair must be displaced to make room for a pair that is needed
and not presently in the cache.
Direct mapping
The nine least significant bits constitute the index field and the remaining six bits form the tag
field.
The figure 9.6 shows that main memory needs an address that includes both the tag and the index.
Figure 9.6: Addressing relationships between main and cache memories
The number of bits in the index field is equal to the number of address bits required to access the
cache memory.
The internal organization of the words in the cache memory is as shown in figure 9.7.
Each word in cache consists of the data word and its associated tag.
When a new word is first brought into the cache, the tag bits are stored alongside the data bits.
When the CPU generates a memory request the index field is used for the address to access the
cache.
The tag field of the CPU address is compared with the tag in the word read from the cache.
If the two tags match, there is a hit and the desired data word is in cache.
If there is no match, there is a miss and the required word is read from main memory.
It is then stored in the cache together with the new tag, replacing the previous value.
The word at address zero is presently stored in the cache (index = 000, tag = 00, data = 1220).
Suppose that the CPU now wants to access the word at address 02000.
The index address is 000, so it is used to access the cache. The two tags are then compared.
The cache tag is 00 but the address tag is 02, which does not produce a match.
Therefore, the main memory is accessed and the data word 5670 is transferred to the CPU.
The cache word at index address 000 is then replaced with a tag of 02 and data of 5670.
The disadvantage of direct mapping is that two words with the same index in their address but
with different tag values cannot reside in cache memory at the same time.
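A minimal Python sketch of direct mapping with the sizes used above: a 15-bit CPU address splits into a 6-bit tag and a 9-bit index; the index selects the cache word and the stored tag decides hit or miss. The dictionary stands in for the 512-word cache and the function name is illustrative.

INDEX_BITS = 9
cache = {}                                      # index -> (tag, data)

def cache_read(address, main_memory):
    index = address & ((1 << INDEX_BITS) - 1)
    tag = address >> INDEX_BITS
    entry = cache.get(index)
    if entry is not None and entry[0] == tag:
        return entry[1], "hit"
    data = main_memory[address]                 # miss: read main memory
    cache[index] = (tag, data)                  # replace the old tag/data pair
    return data, "miss"

memory = {0o00000: 0o1220, 0o02000: 0o5670}     # octal addresses/data from the example
print(cache_read(0o00000, memory))              # (656, 'miss')  -> word 1220 (octal) cached
print(cache_read(0o02000, memory))              # (3000, 'miss') -> 5670 (octal) replaces index 000
print(cache_read(0o02000, memory))              # (3000, 'hit')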
Set-associative mapping
The comparison logic is done by an associative search of the tags in the set, similar to an
associative memory search: thus the name "set-associative".
When a miss occurs in a set-associative cache and the set is full, it is necessary to replace one of
the tag-data items with a new value.
The most common replacement algorithms used are: random replacement, first-in first-out (FIFO),
and least recently used (LRU).
Write-Through
The simplest and most commonly used procedure is to update main memory with every memory
write operation, with the cache memory being updated in parallel if it contains the word at the
specified address. This is called the write-through method.
This method has the advantage that main memory always contains the same data as the cache.
It ensures that the data residing in main memory are valid at all times so that an I/O device
communicating through DMA would receive the most recent updated data.
Write-Back (Copy-Back)
In this method, only the cache location is updated during a write operation. The location is then
marked by a flag so that later, when the word is removed from the cache, it is copied into main memory.
The reason for the write-back method is that during the time a word resides in the cache, it may be
updated several times.
However, as long as the word remains in the cache, it does not matter whether the copy in main
memory is out of date, since requests for the word are filled from the cache.
It is only when the word is displaced from the cache that an accurate copy need be rewritten into
main memory.
Virtual Memory
Virtual memory is used to give programmers the illusion that they have a very large memory at
their disposal, even though the computer actually has a relatively small main memory.
A virtual memory system provides a mechanism for translating program-generated addresses into
correct main memory locations.
Address space
An address used by a programmer will be called a virtual address, and the set of such addresses is known
as address space.
Memory space
An address in main memory is called a location or physical address. The set of such locations is called
the memory space.
[Figure 9.9: relation between address space and memory space – auxiliary memory holds Program 1, Data 1,1, Data 1,2, Program 2 and Data 2,1; Program 1 and a portion of its data are currently in main memory.]
As an illustration, consider a computer with a main-memory capacity of 32K
words (K = 1024). Fifteen bits are needed to specify a physical address in memory
since 32K = 2^15.
Suppose that the computer has available auxiliary memory for storing 2^20 = 1024K
words.
words.
Thus auxiliary memory has a capacity for storing information equivalent to the
capacity of 32 main memories.
Denoting the address space by N and the memory space by M, we then have for
this example N = 1024K and M = 32K.
Suppose that program 1 is currently being executed in the CPU. Program 1 and a
portion of its associated data are moved from auxiliary memory into main memory
as shown in figure 9.9.
In our example, the address field of an instruction code will consist of 20 bits but
physical memory addresses must be specified with only 15 bits.
Thus the CPU will reference instructions and data with a 20-bit address, but the
information at this address must be taken from physical memory, because access to
auxiliary storage for individual words would be prohibitively slow.
Address mapping using pages.
The table implementation of the address mapping is simplified if the information
in the address space and the memory space are each divided into groups of fixed
size.
The physical memory is broken down into groups of equal size called blocks,
which may range from 64 to 4096 words each.
The term page refers to groups of address space of the same size.
Consider a computer with an address space of 8K and a memory space of 4K.
If we split each into groups of 1K words we obtain eight pages and four blocks as
shown in figure 9.10.
At any given time, up to four pages of address space may reside in main memory
in any one of the four blocks.
Figure 9.10: Address and memory space split into groups of 1K words
Figure 9.11: Memory table in paged system
The memory-page table consists of eight words, one for each page.
The address in the page table denotes the page number and the content of the word
gives the block number where that page is stored in main memory.
The table shows that pages 1, 2, 5, and 6 are now available in main memory in
blocks 3, 0, 1, and 2, respectively.
A presence bit in each location indicates whether the page has been transferred
from auxiliary memory into main memory.
A 0 in the presence bit indicates that this page is not available in main memory.
The CPU references a word in memory with a virtual address of 13 bits.
The three high-order bits of the virtual address specify a page number and also an
address for the memory-page table.
The content of the word in the memory page table at the page number address is
read out into the memory table buffer register.
If the presence bit is a 1, the block number thus read is transferred to the two high-
order bits of the main memory address register.
The line number from the virtual address is transferred into the 10 low-order bits
of the memory address register.
A read signal to main memory transfers the content of the word to the main
memory buffer register ready to be used by the CPU.
If the presence bit in the word read from the page table is 0, it signifies that the
content of the word referenced by the virtual address does not reside in main
memory.
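A minimal Python sketch of the paged mapping just described: a 13-bit virtual address is split into a 3-bit page number and a 10-bit line; the page table supplies the block number for pages whose presence bit is 1 (the table contents follow the 8K/4K example above). The function name is illustrative.

page_table = {1: 3, 2: 0, 5: 1, 6: 2}     # page -> block, for the pages present in main memory

def translate(virtual_address):
    page = virtual_address >> 10
    line = virtual_address & 0x3FF
    if page not in page_table:             # presence bit = 0
        return None                        # page fault: the page must be brought from auxiliary memory
    return (page_table[page] << 10) | line

print(oct(translate(0b001_0000000101)))    # page 1 -> block 3, same line: 0o6005
print(translate(0b011_0000000000))         # None: page 3 is not in main memory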
Segment
A segment is a set of logically related instructions or data elements associated with
a given name.
Logical address
The length of each segment is allowed to grow and contract according to the needs
of the program being executed. Consider the logical address shown in figure 9.12.
Figure 9.12: Logical to physical address mapping
The page field specifies the page within the segment and word field gives specific
word within the page.
Thus the length of a segment would vary according to the number of pages that are
assigned to it.
The mapping of the logical address into a physical address is done by means of
two tables, as shown in figure 9.12.
The segment number of the logical address specifies the address for the segment
table.
The entry in the segment table is a pointer address for a page table base.
The page table base is added to the page number given in the logical address.
The concatenation of the block field with the word field produces the final
physical mapped address.
The two mapping tables may be stored in two separate small memories or in main
memory.
In either case, memory reference from the CPU will require three accesses to
memory: one from the segment table, one from the page table and the third from
main memory.
This would slow the system significantly when compared to a conventional system
that requires only one reference to memory.
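A minimal Python sketch of the two-table mapping just described: the segment table points to a page-table base, the page number is added to that base, and the block found there is concatenated with the word field. The table contents are made-up values purely for illustration.

segment_table = {6: 0o35}                # segment -> page-table base (made-up)
page_table = {0o35 + 2: 0o12}            # page-table entry: base + page -> block (made-up)

def map_logical(segment, page, word, word_bits=8):
    base = segment_table[segment]        # first memory access: segment table
    block = page_table[base + page]      # second access: page table
    return (block << word_bits) | word   # third access uses this physical address

print(oct(map_logical(6, 2, 0o164)))     # block 12 concatenated with word 164 -> 0o5164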
REFERENCE :
1. M. Morris Mano, Computer System Architecture, 3rd Edition, Prentice Hall India.