Computer Architecture: Optional Homework Set: Black Board Due Date: Hard Copy Due Date


Name:

Student ID:

Computer Architecture: Optional Homework Set


Blackboard due date:

Monday April 27th, at Midnight.

Hard Copy due date:

Tuesday April 28th, during Class.

Exercise 1: (50 Points)


Patterson and Hennessy 5th Edition 5.2
Caches are important to providing a high-performance memory hierarchy to processors. Below
is a list of 32-bit memory address references, given as word addresses.
3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253
1. For each of these references, identify the binary address, the tag, and the index given a
direct-mapped cache with 16 one-word blocks. Also list if each reference is a hit or a miss,
assuming the cache is initially empty.
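
The address breakdown asked for here can be sketched in Python. This is only an illustration under stated assumptions: the function name `simulate_direct_mapped` and its parameters are hypothetical, and each reference is treated as a word address. The block address is the word address divided by the block size in words, the index is the block address modulo the number of blocks, and the tag is whatever remains above the index.

```python
def simulate_direct_mapped(word_addresses, num_blocks, words_per_block=1):
    """Simulate a direct-mapped cache that starts empty.

    Returns one tuple per reference: (address, binary string, tag, index, hit).
    The function and parameter names are hypothetical, chosen for illustration.
    """
    cache = {}  # index -> tag currently stored in that cache line
    results = []
    for addr in word_addresses:
        block_address = addr // words_per_block
        index = block_address % num_blocks
        tag = block_address // num_blocks
        hit = cache.get(index) == tag
        cache[index] = tag  # on a miss, the new block replaces the old one
        results.append((addr, format(addr, "032b"), tag, index, hit))
    return results


if __name__ == "__main__":
    # A short made-up trace (not the homework references), 16 one-word blocks.
    for addr, bits, tag, index, hit in simulate_direct_mapped([5, 21, 5, 6, 6], 16):
        print(addr, bits, tag, index, "hit" if hit else "miss")
```

Passing `words_per_block=2` and `num_blocks=8` would model a cache with two-word blocks instead; the tag and index are then derived from the block address rather than the word address.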

Result

2. For each of these references, identify the binary address, the tag, and the index given a
direct-mapped cache with two-word blocks and a total size of 8 blocks. Also list if each
reference is a hit or a miss, assuming the cache is initially empty.

Result

3. You are asked to optimize a cache design for the given references. There are three direct-mapped cache designs possible, all with a total of 8 words of data: C1 has 1-word blocks,
C2 has 2-word blocks, and C3 has 4-word blocks. In terms of miss rate, which cache design
is the best? If the miss stall time is 25 cycles, and C1 has an access time of 2 cycles, C2
takes 3 cycles, and C3 takes 5 cycles, which is the best cache design?

Result

There are many different design parameters that are important to a cache's overall performance.
Below are listed parameters for different direct-mapped cache designs.
Cache Data Size: 32 KiB
Cache Block Size: 2 words
Cache Access Time: 1 cycle
4. Calculate the total number of bits required for the cache listed above, assuming a 32-bit
address. Given that total size, find the total size of the closest direct-mapped cache with
16-word blocks of equal size or greater. Explain why the second cache, despite its larger
data size, might provide slower performance than the first cache.
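
The bit count asked for here follows the usual accounting for a direct-mapped cache: each block stores its data bits, a tag, and a valid bit. The sketch below is a minimal illustration, assuming 4-byte words, power-of-two sizes, exactly one valid bit per block, and no dirty or replacement bits. The function name `cache_total_bits` is hypothetical.

```python
import math


def cache_total_bits(data_bytes, words_per_block, address_bits=32, word_bytes=4):
    """Total storage bits (data + tag + valid) of a direct-mapped cache.

    Assumes power-of-two sizes and one valid bit per block (illustrative only).
    """
    block_bytes = words_per_block * word_bytes
    num_blocks = data_bytes // block_bytes
    index_bits = int(math.log2(num_blocks))    # selects the cache line
    offset_bits = int(math.log2(block_bytes))  # byte offset within the block
    tag_bits = address_bits - index_bits - offset_bits
    bits_per_block = block_bytes * 8 + tag_bits + 1  # data + tag + valid
    return num_blocks * bits_per_block


if __name__ == "__main__":
    # An example configuration: 16 KiB of data with 4-word blocks.
    print(cache_total_bits(16 * 1024, 4))
```

Calling the function with different `data_bytes` values for a fixed block size is one way to search for the smallest cache whose total bit count meets a given target.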

Result

5. Generate a series of read requests that have a lower miss rate on a 2 KiB 2-way set
associative cache than the cache listed above. Identify one possible solution that would
make the cache listed have an equal or lower miss rate than the 2 KiB cache. Discuss the
advantages and disadvantages of such a solution.

Result

6. The formula shown in Section 5.3 shows the typical method to index a direct-mapped
cache, specifically (Block address) modulo (Number of blocks in the cache). Assuming a
32-bit address and 1024 blocks in the cache, consider a different indexing function,
specifically (Block address [31:27] XOR Block address [26:22]). Is it possible to use this to
index a direct-mapped cache? If so, explain why and discuss any changes that might need
to be made to the cache. If it is not possible, explain why.
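
To experiment with the alternative indexing function, the two bit fields can be extracted with shifts and masks. The sketch below is illustrative only; the name `xor_index` is hypothetical, and the block address is assumed to be held in a 32-bit unsigned value.

```python
def xor_index(block_address):
    """Compute (Block address[31:27]) XOR (Block address[26:22]).

    Hypothetical helper; treats block_address as a 32-bit unsigned value.
    """
    upper = (block_address >> 27) & 0x1F  # bits 31..27, a 5-bit field
    lower = (block_address >> 22) & 0x1F  # bits 26..22, a 5-bit field
    return upper ^ lower


if __name__ == "__main__":
    # Print the index produced for a few sample block addresses.
    for addr in (0x00000000, 0x08000000, 0x00400000, 0xFFFFFFFF):
        print(hex(addr), xor_index(addr))
```

Comparing the range of values this function can return against the number of blocks in the cache, and checking which address bits never influence the result, is a useful starting point for the discussion.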

Result

Exercise 2: (50 Points)


Patterson and Hennessy 5th Edition 5.5
Media applications that play audio or video files are part of a class of workloads called
"streaming" workloads; i.e., they bring in large amounts of data but do not reuse much of it.
Consider a video streaming workload that accesses a 512 KiB working set sequentially with the
following address stream:
0, 2, 4, 6, 8, 10, 12, 14, 16, ...
1. Assume a 64 KiB direct-mapped cache with a 32-byte block. What is the miss rate for the
address stream above? How is this miss rate sensitive to the size of the cache or the
working set? How would you categorize the misses this workload is experiencing, based
on the 3C model?
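
One way to check a hand-computed miss rate is a small simulation of the sequential stream. The sketch below makes several assumptions that the problem statement does not spell out: the addresses are byte addresses, the working set is traversed exactly once, and the cache starts empty. The function name `streaming_miss_rate` and its parameters are hypothetical.

```python
def streaming_miss_rate(working_set_bytes, stride_bytes, block_bytes, cache_bytes):
    """Miss rate of one sequential pass through a direct-mapped cache.

    Assumes byte addresses and a cold cache (illustrative sketch only).
    """
    num_blocks = cache_bytes // block_bytes
    cache = [None] * num_blocks  # tag stored in each cache line
    accesses = 0
    misses = 0
    for addr in range(0, working_set_bytes, stride_bytes):
        block_address = addr // block_bytes
        index = block_address % num_blocks
        tag = block_address // num_blocks
        accesses += 1
        if cache[index] != tag:
            misses += 1
            cache[index] = tag
    return misses / accesses


if __name__ == "__main__":
    # Small made-up working set; vary block_bytes to see the effect of block size.
    for block_bytes in (16, 32, 64, 128):
        print(block_bytes, streaming_miss_rate(4096, 2, block_bytes, 64 * 1024))
```

Changing `cache_bytes` or `working_set_bytes` while holding the block size fixed shows how sensitive the miss rate is to those two parameters.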

Result

2. Re-compute the miss rate when the cache block size is 16 bytes, 64 bytes, and 128 bytes.
What kind of locality is this workload exploiting?

Result

3. "Prefetching" is a technique that leverages predictable address patterns to speculatively
bring in additional cache blocks when a particular cache block is accessed. One example
of prefetching is a stream buffer that prefetches sequentially adjacent cache blocks into
a separate buffer when a particular cache block is brought in. If the data is found in the
prefetch buffer, it is considered as a hit and moved into the cache and the next cache
block is prefetched. Assume a two-entry stream buffer and assume that the cache latency
is such that a cache block can be loaded before the computation on the previous cache
block is completed. What is the miss rate for the address stream above?

Result

Cache block size (B) can affect both miss rate and miss latency. Assuming a 1-CPI machine with
an average of 1.35 references (both instruction and data) per instruction, help find the optimal
block size given the following miss rates for various block sizes.
Block size (B): miss rate
8: 4%
16: 3%
32: 2%
64: 1.5%
128: 1%
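
The comparison across block sizes can be sketched as a small calculation. This is one common way to model the problem, not necessarily the intended solution: effective CPI is taken as the base CPI plus references per instruction times miss rate times miss latency, with the references-per-instruction figure read from the text as 1.35. The function name `best_block_size` and the latency model used in the demo are hypothetical.

```python
def best_block_size(miss_rates, latency_fn, base_cpi=1.0, refs_per_instruction=1.35):
    """Return (block size with the lowest effective CPI, CPI per block size).

    miss_rates maps block size -> miss rate as a fraction (e.g. 0.04 for 4%).
    latency_fn maps block size -> miss latency in cycles.
    Model assumed here: CPI = base + refs/instr * miss rate * miss latency.
    """
    cpi = {
        block: base_cpi + refs_per_instruction * rate * latency_fn(block)
        for block, rate in miss_rates.items()
    }
    return min(cpi, key=cpi.get), cpi


if __name__ == "__main__":
    rates = {8: 0.04, 16: 0.03, 32: 0.02, 64: 0.015, 128: 0.01}
    # Hypothetical latency model for demonstration: 10 + 2 * B cycles.
    best, table = best_block_size(rates, lambda b: 10 + 2 * b)
    for block, value in table.items():
        print(block, round(value, 3))
    print("lowest CPI at block size", best)
```

Swapping in a different `latency_fn` changes which block size wins, which is the trade-off the following questions explore.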

4. What is the optimal block size for a miss latency of 20 × B cycles?

Result

5. What is the optimal block size for a miss latency of 24 + B cycles?

Result

6. For constant miss latency, what is the optimal block size?

Result

Problem statements are included for completeness, but to avoid any confusion introduced by
typos, you should double check all data with your version of the book. In case of any discrepancy
between the statements included here and the book, always go with the book version.
Both the Blackboard submission and the hard copy submission are mandatory. Partial credit will be
given where work is shown. Please ensure that your name is written on all sheets of your hard
copy submission and that it is securely stapled. Blackboard submissions must be PDFs.
If you have more than one file in your submission, you must group them in a zip file. Instructions
on how to make a zip file in Windows are here: http://www.wikihow.com/Make-a-Zip-File.
Late assignments will be accepted with a penalty of 10 percent per day.
