Name: Lê Minh Quang
Class: A4
Student ID: 2410814
Chapter 5
Exercise 1:
a. Each 64-bit integer is 8 bytes. A 16-byte cache block can store 16 / 8 = 2 integers.
-> 2 64-bit integers per block.
b. Temporal locality is observed when the same memory location is accessed repeatedly in a
short span. In the C code, B[I][0] is accessed repeatedly inside the inner loop for fixed I.
-> B[I][0] exhibits temporal locality.
c. Spatial locality is observed when nearby memory locations are accessed. Since A[I][J]
accesses all elements of a row sequentially (J from 0 to 7999), it has spatial locality.
-> A[I][J] exhibits spatial locality.
d. In MATLAB, matrices are stored in column-major order. B(I,0) is accessed repeatedly for
fixed I inside the inner loop, showing temporal locality.
-> B(I,0) exhibits temporal locality.
e. A(J,I) walks down a column as J varies; in column-major storage those elements are
contiguous in memory, so the access pattern has spatial locality.
-> A(J,I) exhibits spatial locality.
f. Total elements = 8 × 8000 (A) + 8 (B) = 64008 64-bit integers. Each 16-byte cache block
holds 2 integers, so blocks needed = 64008 / 2 = 32004.
-> 32004 cache blocks in both MATLAB and C.
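The arithmetic above can be checked in a few lines (a sketch; the constants are taken from the exercise statement):

```python
# Rough check of the counts in Exercise 1a and 1f.
INT_BYTES = 8                               # one 64-bit integer
BLOCK_BYTES = 16                            # cache block size
ints_per_block = BLOCK_BYTES // INT_BYTES   # part a: 2 integers per block

total_elements = 8 * 8000 + 8               # A (8 x 8000) plus B (8 elements)
blocks_needed = total_elements // ints_per_block
print(ints_per_block, total_elements, blocks_needed)   # 2 64008 32004
```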
Exercise 2:
a. Direct-mapped cache with 16 one-word blocks:
Each address maps to a block via the 4 LSBs of its block address. Since no block address
repeats and the cache starts empty, every access is a miss.
b. Cache with 2-word blocks and 8 blocks total:
Offset = 1 bit, Index = 3 bits, rest is tag. There are 4 hits out of 12 accesses (at addresses
0x02, 0xbe, 0xb5, 0xfd).
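Answers like 2a/2b can be checked with a tiny direct-mapped cache simulator (a sketch; the trace below is hypothetical, so substitute the exercise's actual address list):

```python
# Minimal direct-mapped cache simulator: counts hits for a word-address trace.
def dm_simulate(trace, num_blocks, words_per_block):
    offset_bits = (words_per_block - 1).bit_length()
    cache = {}                          # index -> tag
    hits = 0
    for addr in trace:
        block = addr >> offset_bits     # drop the block-offset bits
        index = block % num_blocks
        tag = block // num_blocks
        if cache.get(index) == tag:
            hits += 1
        else:
            cache[index] = tag          # miss: fill (or replace) the block
    return hits

trace = [0x03, 0x02, 0xb4, 0xb5]        # hypothetical word addresses
print(dm_simulate(trace, 16, 1))        # part a geometry: 16 one-word blocks -> 0
print(dm_simulate(trace, 8, 2))         # part b geometry: 8 two-word blocks -> 2
```

With two-word blocks, 0x02 hits in the block loaded by 0x03 and 0xb5 in the block loaded by 0xb4, which is the same mechanism that produces the hits listed in part b.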
c. Cache designs (8-word total):
C1 (1-word blocks): most blocks, so fewest conflicts, but no spatial-locality benefit on
sequential access.
C2 (2-word blocks): balances block count against spatial locality.
C3 (4-word blocks): best for sequential access, but only 2 blocks, so more conflicts.
-> C2 is often the best compromise.
Exercise 3:
a. Cache block size is 16 bytes = 4 words.
Offset = 4 bits.
b. Cache has 64 blocks.
Index = 6 bits.
Tag = 22 bits (from 32 - 6 - 4).
c. Tag = upper 22 bits
Index = 6 bits
Offset = 4 bits
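The field split from parts a-c can be written as a small helper (a sketch using the bit widths derived above):

```python
# Splitting a 32-bit byte address into tag / index / offset for Exercise 3:
# 16-byte blocks -> 4 offset bits; 64 blocks -> 6 index bits; 22 tag bits.
OFFSET_BITS, INDEX_BITS = 4, 6

def split(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

print(split(0x0000D000))   # -> tag 0x34, index 0, offset 0
print(split(0x0000C020))   # -> tag 0x30, index 2, offset 0
```

The two sample addresses correspond to the start of the blocks listed in part e.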
d. Hit ratio = hits / total accesses = 6/11 ≈ 54.5%
e. Final cache state: <index 0, tag 0x34, Mem[0xD000–0xD00F]>, <index 2, tag 0x30, Mem[0xC020–0xC02F]>
Exercise 4:
a. Buffers between caches:
- Between L1 and L2: write buffer, read buffer
- Between L2 and memory: write buffer
b. L1 write-miss handling (write-back, write-allocate):
1. Look the block up in L2.
2. If the L1 victim block is dirty, write it back to L2.
3. Bring the requested block from L2 into L1.
4. Update the LRU state.
5. Perform the write and mark the block dirty.
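The steps above can be sketched as a toy model (hypothetical classes, a single-entry L1 set, write-back and write-allocate assumed; not the textbook's code):

```python
# Toy model of an L1 write miss in a write-back, write-allocate cache.
class Block:
    def __init__(self, tag, data, dirty=False):
        self.tag, self.data, self.dirty = tag, data, dirty

class L2Cache:
    def __init__(self):
        self.store = {}                      # tag -> data held by the next level
    def read_block(self, tag):               # steps 1 & 3: look up / fetch
        return Block(tag, self.store.get(tag, 0))
    def write_block(self, blk):              # step 2: accept a dirty write-back
        self.store[blk.tag] = blk.data

def l1_write_miss(l1_set, l2, tag, data, capacity=1):
    if len(l1_set) == capacity:              # must evict the LRU block
        victim = l1_set.pop(0)
        if victim.dirty:
            l2.write_block(victim)           # step 2: write it back to L2
    blk = l2.read_block(tag)                 # steps 1 & 3: bring block from L2
    blk.data, blk.dirty = data, True         # step 5: write the data, mark dirty
    l1_set.append(blk)                       # step 4: insert at the MRU position

l2 = L2Cache()
l1_set = [Block(0xA, 111, dirty=True)]       # one dirty resident block
l1_write_miss(l1_set, l2, 0xB, 222)
print(l2.store[0xA], hex(l1_set[0].tag), l1_set[0].dirty)   # 111 0xb True
```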
c. Exclusive caches (a block resides in at most one level):
L1 write miss: fetch the block from L2, move it into L1, and invalidate the L2 copy. A dirty
block evicted from L1 is written into L2.
L1 read miss: move the block from L2 into L1 and invalidate the L2 copy.
Exercise 5:
a. 64 KiB cache, 32-byte blocks, 512 KiB working set:
Working-set blocks = 512 KiB / 32 B = 16384; cache blocks = 64 KiB / 32 B = 2048. The working
set is 8x the cache, so streaming through it evicts every block before it can be reused.
-> ~100% miss rate. The first pass gives compulsory misses; later passes give capacity
misses (3C model).
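The block counts behind this claim (values from the exercise statement):

```python
# Block counts for 5a: the working set is 8x larger than the cache.
KiB = 1024
cache_blocks = 64 * KiB // 32        # blocks in the cache
ws_blocks = 512 * KiB // 32          # blocks in the working set
# Every cached block is evicted before the next pass reaches it again.
print(cache_blocks, ws_blocks, ws_blocks // cache_blocks)   # 2048 16384 8
```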
b. Varying block sizes:
With smaller blocks, more blocks must be fetched to cover the same data, so misses increase.
Larger blocks bring in more adjacent data per miss, so a miss rate that falls as block size
grows shows the workload is reusing neighboring data.
-> Workload exploits spatial locality.
c. Prefetching with a 2-entry stream buffer:
Sequential accesses hit in the stream buffer, so only about every third access misses
-> approx. 1/3 miss rate.
Exercise 6:
a. TLB hits: 0, page table hits: 0, page faults: 10 (every page is referenced for the first time)
b. Using 16 KiB pages: fewer TLB misses, fewer entries, less overhead.
Trade-off: more internal fragmentation.
c. With a 2-way set-associative TLB: fewer conflict misses than a direct-mapped TLB of the
same size, so the miss rate improves slightly.
d. With a direct-mapped TLB: only one entry per set -> more conflict misses.
-> Least flexible; highest miss rate.
e. The TLB caches translations so that most memory accesses avoid a page-table walk. Without
a TLB, every access would need one or more extra memory references just to translate its
address, making execution very slow.
Exercise 7:
a. LRU policy: evicts the least recently used block, so recently reused blocks stay
resident. The number of hits rises once the first few accesses have warmed the cache.
Total hits: 3 (steps 6, 7, 8)
Total misses: 17
Hit rate: 3/20 = 15%
b. MRU policy: evicts the most recently used block, which usually hurts workloads with
temporal locality. On this trace it happens to match LRU:
Hits: 3
Misses: 17
c. Random policy: evicts a randomly chosen block, so results vary from run to run (simulate,
e.g., with coin flips).
Expected hits: 2-3
d. Optimal replacement (OPT/Belady): evicts the block whose next use lies farthest in the
future. It requires knowledge of future references, so it is impractical, but it gives a
lower bound on misses.
Hits: at least 3 (OPT can never do worse than LRU on the same trace)
e. OPT is hard to implement because it requires perfect knowledge of future references.
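The three policies can be compared with a small simulator (a sketch; the trace below is hypothetical, not the exercise's actual sequence, and the cache holds 3 blocks):

```python
# Count hits for a block trace under LRU, MRU, or OPT replacement.
def simulate_policy(trace, capacity, policy):
    cache, hits = [], 0
    for i, b in enumerate(trace):
        if b in cache:
            hits += 1
            cache.remove(b)
            cache.append(b)            # refresh: move to the MRU position
            continue
        if len(cache) == capacity:
            if policy == "LRU":
                cache.pop(0)           # evict the least recently used block
            elif policy == "MRU":
                cache.pop()            # evict the most recently used block
            else:                      # OPT: evict block needed farthest ahead
                future = trace[i + 1:]
                victim = max(cache, key=lambda c: future.index(c)
                             if c in future else len(future) + 1)
                cache.remove(victim)
        cache.append(b)
    return hits

trace = [1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 3]   # hypothetical block trace
for p in ("LRU", "MRU", "OPT"):
    print(p, simulate_policy(trace, 3, p))     # LRU 6, MRU 4, OPT 6
```

On this trace LRU and OPT each get 6 hits while MRU gets 4, illustrating the orderings claimed in parts b and d.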
f. Selectively bypassing the cache for data with no reuse leaves more space for reusable
data, lowering the overall miss rate.