+

Set Associative Mapping

◼ Compromise that exhibits the strengths of both the direct and associative approaches while reducing their disadvantages

◼ Cache consists of a number of sets

◼ Each set contains a number of lines

◼ A given block maps to any line in a given set

◼ e.g. 2 lines per set
◼ 2-way set associative mapping
◼ A given block can be in one of 2 lines in only one set
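The mapping above can be sketched in a few lines. This is an illustrative sketch, not from the text; the set count and helper name are assumptions chosen for the example.

```python
# Minimal sketch (illustrative): a given block maps to exactly one set,
# and may occupy either of that set's k lines.

NUM_SETS = 4          # v sets (kept small for illustration)
LINES_PER_SET = 2     # k = 2 -> 2-way set associative

def set_for_block(block_number: int) -> int:
    """A given block maps to only one set: set = block mod v."""
    return block_number % NUM_SETS

# Blocks 1, 5, 9, ... all compete for the 2 lines of set 1.
print(set_for_block(1), set_for_block(5), set_for_block(9))  # -> 1 1 1
```

Within the selected set, placement is associative: the block may go in either of the set's two lines, which is where a replacement algorithm (discussed later) comes in.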

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.


Figure 4.13 Mapping From Main Memory to Cache: k-way Set Associative
[diagram omitted] The cache can be viewed two ways: (a) as v associative-mapped caches — each set (set 0 … set v–1) is a fully associative cache of k lines, and the first v blocks of main memory (B0 … Bv–1, equal to the number of sets) map to distinct sets; (b) as k direct-mapped caches — each way (way 1 … way k) is a direct-mapped cache of v lines (L0 … Lv–1).


Figure 4.14 k-Way Set Associative Cache Organization
[diagram omitted] The (s+w)-bit memory address is split into a Tag field (s–d bits), a Set field (d bits), and a Word field (w bits). The Set field selects one of the v sets (lines F0 … Fk–1 form set 0, Fk … F2k–1 form set 1, and so on), and the Tag is compared in parallel against the tags of all k lines in that set: 1 if match (hit in cache), 0 if no match (miss in cache).


+
Set Associative Mapping Summary
◼ Address length = (s + w) bits

◼ Number of addressable units = 2^(s+w) words or bytes

◼ Block size = line size = 2^w words or bytes

◼ Number of blocks in main memory = 2^(s+w)/2^w = 2^s

◼ Number of lines in set = k

◼ Number of sets = v = 2^d

◼ Number of lines in cache = m = kv = k × 2^d

◼ Size of cache = k × 2^(d+w) words or bytes

◼ Size of tag = (s – d) bits
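The relations above can be checked mechanically. The sketch below is illustrative (the function name is an assumption); the sample values s = 22, d = 13, w = 2, k = 2 match the 2-way example of Figure 4.15.

```python
# Hedged sketch: computing the set-associative parameters listed above
# from s (tag+set bits), d (set bits), w (word bits), and k (lines per set).

def set_assoc_params(s: int, d: int, w: int, k: int) -> dict:
    return {
        "address_bits":     s + w,             # address length = s + w
        "blocks_in_memory": 2 ** s,            # 2^(s+w) / 2^w = 2^s
        "sets":             2 ** d,            # v = 2^d
        "lines_in_cache":   k * 2 ** d,        # m = k * v
        "cache_size_words": k * 2 ** (d + w),  # k * 2^(d+w)
        "tag_bits":         s - d,             # s - d
    }

# Figure 4.15's configuration: 24-bit address, 13-bit set, 2-bit word, k = 2.
p = set_assoc_params(s=22, d=13, w=2, k=2)
print(p["tag_bits"], p["sets"], p["lines_in_cache"])  # -> 9 8192 16384
```

The computed values agree with the figure: a 9-bit tag, 8192 sets, and a 16K-line cache.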


Figure 4.15 Two-Way Set Associative Mapping Example
Main Memory Address = Tag + Set + Word: 9-bit tag, 13-bit set, 2-bit word (24-bit addresses; 16-MByte main memory; 16K-line cache; 32-bit data per line).

Sample main memory contents (addresses in binary, data in hexadecimal):
Tag 000, address 000000000000000000000000 → 13579246
Tag 02C, address 000101100000000000000000 → 77777777
Tag 02C, address 000101100000000000000100 → 11235813
Tag 02C, address 000101100011001110011100 → FEDCBA98
Tag 02C, address 000101100111111111111100 → 12345678
Tag 1FF, address 111111111111111111111000 → 11223344
Tag 1FF, address 111111111111111111111100 → 24682468

Resulting cache contents (set number, then tag and data for each way):
Set 0000: tag 000, data 13579246 | tag 02C, data 77777777
Set 0001: tag 02C, data 11235813
Set 0CE7: tag 02C, data FEDCBA98
Set 1FFE: tag 1FF, data 11223344
Set 1FFF: tag 02C, data 12345678 | tag 1FF, data 24682468

Note: Memory address values are in binary representation; other values are in hexadecimal
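The 9/13/2-bit split used in Figure 4.15 can be reproduced with a few shifts and masks. This is an illustrative sketch; the helper name is an assumption.

```python
# Illustrative sketch of the address split in Figure 4.15: a 24-bit address
# divides into a 9-bit tag, a 13-bit set number, and a 2-bit word field.

def split_address(addr: int) -> tuple:
    word = addr & 0b11             # low 2 bits select the word in the line
    set_no = (addr >> 2) & 0x1FFF  # next 13 bits select one of 8192 sets
    tag = addr >> 15               # top 9 bits are the stored tag
    return tag, set_no, word

# Address 000101100000000000000000 (binary) from the figure:
tag, set_no, word = split_address(0b000101100000000000000000)
print(f"{tag:03X} {set_no:04X} {word}")  # -> 02C 0000 0
```

Running the same split on address 000101100011001110011100 yields tag 02C and set 0CE7, matching the figure's cache contents.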


Figure 4.16 Varying Associativity over Cache Size
[plot omitted] Hit ratio (0.0–1.0) versus cache size (1k–1M bytes) for direct-mapped and 2-way, 4-way, 8-way, and 16-way set-associative caches.


+
Replacement Algorithms

◼ Once the cache has been filled, when a new block is brought
into the cache, one of the existing blocks must be replaced

◼ For direct mapping there is only one possible line for any
particular block and no choice is possible

◼ For the associative and set-associative techniques a replacement algorithm is needed

◼ To achieve high speed, an algorithm must be implemented in hardware


+ The most common replacement
algorithms are:
◼ Least recently used (LRU)
◼ Most effective
◼ Replace that block in the set that has been in the cache longest with
no reference to it
◼ Because of its simplicity of implementation, LRU is the most popular
replacement algorithm

◼ First-in-first-out (FIFO)
◼ Replace that block in the set that has been in the cache longest
◼ Easily implemented as a round-robin or circular buffer technique

◼ Least frequently used (LFU)
◼ Replace that block in the set that has experienced the fewest references
◼ Could be implemented by associating a counter with each line
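LRU for one set can be sketched in software to show the behavior the hardware implements. This is an assumed illustration (class and method names are hypothetical), not the book's hardware mechanism, which typically uses USE bits or an age matrix rather than a list.

```python
# Minimal sketch (assumed implementation): LRU replacement for one k-line
# set, using an ordered dict as the recency order (oldest entry first).

from collections import OrderedDict

class LRUSet:
    def __init__(self, k: int):
        self.k = k                     # lines per set
        self.lines = OrderedDict()     # tag -> data, least recent first

    def access(self, tag, data=None) -> bool:
        if tag in self.lines:          # hit: move to most-recent position
            self.lines.move_to_end(tag)
            return True
        if len(self.lines) >= self.k:  # miss with full set: evict LRU line
            self.lines.popitem(last=False)
        self.lines[tag] = data         # bring in the new block
        return False

s = LRUSet(k=2)
s.access("A"); s.access("B"); s.access("A")
s.access("C")                # set is full: evicts "B", the LRU line
print(list(s.lines))         # -> ['A', 'C']
```

Note how re-referencing "A" saved it: the block replaced is the one in the set that has been in the cache longest with no reference to it.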


Write Policy

When a block that is resident in the cache is to be replaced there are two cases to consider:

◼ If the old block in the cache has not been altered then it may be overwritten with a new block without first writing out the old block

◼ If at least one write operation has been performed on a word in that line of the cache then main memory must be updated by writing the line of cache out to the block of memory before bringing in the new block

There are two problems to contend with:

◼ More than one device may have access to main memory

◼ A more complex problem occurs when multiple processors are attached to the same bus and each processor has its own local cache - if a word is altered in one cache it could conceivably invalidate a word in other caches


+
Write Through
and Write Back
◼ Write through
◼ Simplest technique
◼ All write operations are made to main memory as well as to the cache
◼ The main disadvantage of this technique is that it generates substantial
memory traffic and may create a bottleneck

◼ Write back
◼ Minimizes memory writes
◼ Updates are made only in the cache
◼ Portions of main memory are invalid and hence accesses by I/O
modules can be allowed only through the cache
◼ This makes for complex circuitry and a potential bottleneck
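The traffic difference between the two policies can be made concrete with a dirty-bit sketch. This is an assumed illustration (function names and the counter are hypothetical), not a description of any particular hardware.

```python
# Hedged sketch of the two policies: write-through sends every write to
# main memory immediately; write-back only marks the line dirty and writes
# it to memory once, when the line is replaced.

memory_writes = 0

def write_through(line: dict, value: int):
    global memory_writes
    line["data"] = value
    memory_writes += 1        # every write also goes to main memory

def write_back(line: dict, value: int):
    line["data"] = value
    line["dirty"] = True      # memory is updated only on eviction

def evict(line: dict):
    global memory_writes
    if line.get("dirty"):     # write-back: flush the altered line once
        memory_writes += 1
        line["dirty"] = False

line = {"data": 0}
for v in range(10):
    write_back(line, v)       # ten writes to the same cached line
evict(line)
print(memory_writes)          # -> 1 (write-through would have done 10)
```

Ten writes to one line cost a single memory write under write-back, versus ten under write-through - the bottleneck the slide describes.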


Line Size
When a block of data is retrieved and placed in the cache not only the desired word but also some number of adjacent words are retrieved

As the block size increases more useful data are brought into the cache

As the block size increases the hit ratio will at first increase because of the principle of locality

Two specific effects come into play:
◼ Larger blocks reduce the number of blocks that fit into a cache
◼ As a block becomes larger each additional word is farther from the requested word

The hit ratio will begin to decrease as the block becomes bigger and the probability of using the newly fetched information becomes less than the probability of reusing the information that has to be replaced
Figure 4.17 Total Hit Ratio (L1 and L2) for 8-Kbyte and 16-Kbyte L1
[plot omitted] Total hit ratio (0.78–0.98) versus L2 cache size (1k–2M bytes) for L1 = 8k and L1 = 16k.


+ Summary
Chapter 4: Cache Memory

◼ Computer memory system overview
◼ Characteristics of Memory Systems
◼ Memory Hierarchy
◼ Cache memory principles
◼ Elements of cache design
◼ Cache addresses
◼ Cache size
◼ Mapping function
◼ Replacement algorithms
◼ Write policy
◼ Line size
◼ Number of caches