

Cache Optimizations

Muhammad Tahir

Lecture 21

Electrical Engineering Department
University of Engineering and Technology Lahore

Contents

1 Cache Performance

2 Reducing the Miss Rate

3 Reducing the Miss Penalty

4 Reducing Hit Time


Cache Performance Analysis

• Average memory access time (AMAT) when using a cache

AMAT = (1 − MissRate) × HitTime + MissRate × MissTime

• Define MissTime as

MissTime = HitTime + MissPenalty

• Substituting gives the simplified AMAT

AMAT = HitTime + MissRate × MissPenalty
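
As a quick illustration (a minimal sketch in C with assumed example numbers, not measurements from any particular machine), the formula can be evaluated directly:

    #include <stdio.h>

    /* AMAT in cycles, from hit time, miss rate and miss penalty. */
    static double amat(double hit_time, double miss_rate, double miss_penalty) {
        return hit_time + miss_rate * miss_penalty;
    }

    int main(void) {
        /* Assumed values: 1-cycle hit, 5% miss rate, 100-cycle penalty. */
        printf("AMAT = %.2f cycles\n", amat(1.0, 0.05, 100.0)); /* 6.00 */
        return 0;
    }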


Cache Performance Analysis Cont’d

• Hit Time: time to find the block in the cache and return it to the CPU
• Miss Rate: number of misses divided by the total number of memory accesses made by the CPU
• Miss Penalty: number of additional cycles required upon a miss to fetch the block from the next level of the memory hierarchy


Cache Performance Analysis Cont’d

• To reduce AMAT, we require
  • Hit Time to be low ∼ small and fast cache
  • Miss Rate to be low ∼ large and/or smart cache
  • Miss Penalty to be low ∼ main memory access time should be reduced


Reducing the Miss Rate

Larger cache block size

• Advantage
  • Reduces compulsory misses by exploiting spatial locality
• Disadvantage
  • Increases miss penalty (more data is transferred on each miss)

Choosing the right block size is a complex trade-off, as the sketch below illustrates.
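
The numbers in this sketch are assumptions chosen only to expose the trade-off (the miss-rate curve is a toy model, not measured data): the miss rate falls as blocks grow, but the penalty per miss rises, so AMAT first improves and then worsens.

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* Toy model: spatial locality cuts the miss rate roughly with
           sqrt(block size), while the penalty grows with the number of
           words transferred per miss. All numbers are assumptions. */
        double latency = 80.0;   /* cycles to the first word        */
        double per_word = 2.0;   /* cycles per additional 4B word   */

        for (int block = 16; block <= 1024; block *= 2) {
            double miss_rate = 0.10 * sqrt(16.0 / block);
            double penalty = latency + per_word * (block / 4.0);
            double amat = 1.0 + miss_rate * penalty; /* 1-cycle hit */
            printf("block %4dB: miss rate %.4f  penalty %6.1f  AMAT %.2f\n",
                   block, miss_rate, penalty, amat);
        }
        return 0;
    }

In this assumed model AMAT bottoms out around 128-byte blocks and then climbs again as the transfer time dominates.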


Reducing the Miss Rate Cont’d

Larger cache size

• Advantages
  • Reduces capacity misses
  • Reduces conflict misses
• Disadvantages
  • Larger hit time
  • Higher cost, area & power consumption


Reducing the Miss Rate Cont’d

Higher associativity

• Advantages
  • Reduces conflict misses
• Disadvantages
  • Increases hit time (due to extra comparison hardware)
  • More complex design


Reducing the Miss Penalty

• Multi-level caches
• Victim caches
• Critical word first and early restart
• Merging write buffer
• Giving priority to read misses over writes, i.e., reducing the read miss penalty


Multi-level Caches

• Make the cache fast (L1) to keep pace with the higher clock rate of the processor
• Make the cache large (L2 and L3) to reduce accesses to main memory

[Diagram: CPU → L1 cache → L2 cache → L3 cache → main memory; each level is larger and slower than the one before (fast & small near the CPU, slow & large near memory).]
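
The AMAT formula applies recursively: the L1 miss penalty is itself the average access time of L2, and so on down the hierarchy. A minimal sketch with assumed example parameters:

    #include <stdio.h>

    int main(void) {
        /* Assumed example parameters (illustrative only). */
        double l1_hit = 1.0,  l1_miss_rate = 0.05;
        double l2_hit = 10.0, l2_miss_rate = 0.20; /* local miss rate */
        double mem_penalty = 100.0;

        /* L1's miss penalty is the time to access L2, which in turn
           misses to main memory with its own (local) miss rate. */
        double l2_amat = l2_hit + l2_miss_rate * mem_penalty;
        double amat = l1_hit + l1_miss_rate * l2_amat;

        printf("AMAT with L2    = %.2f cycles\n", amat);  /* 2.50 */
        printf("AMAT without L2 = %.2f cycles\n",
               l1_hit + l1_miss_rate * mem_penalty);      /* 6.00 */
        return 0;
    }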


Multi-level Caches Cont’d

• First-level caches (applies to both instruction and data caches)
  • Latency is the most critical parameter
  • Smaller, with lower associativity
  • Tag and data are accessed simultaneously
• Second-level caches
  • Designed for a better trade-off between hit rate and access latency
  • Larger, with higher associativity
  • Tag and data are accessed sequentially


Victim Caches
• A victim cache (VC) is a small associative backup cache added to a direct-mapped cache
• Holds the most recently evicted cache lines
• Provides the fast hit time of a direct-mapped cache with fewer conflict misses

[Diagram: processor (with register file) backed by an L1 cache, a victim cache, and the L2 cache/main memory. Lines evicted from L1 are placed in the victim cache; on an L1 miss the victim cache is checked, a VC hit refills L1, and lines evicted from the VC go on to the next level.]
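
A minimal software sketch of this flow (sizes, names and the FIFO policy are illustrative assumptions; a real victim cache is a tiny fully associative hardware structure):

    #include <stdbool.h>
    #include <stdint.h>

    #define VC_ENTRIES 4  /* victim caches hold only a few lines */

    typedef struct { uint64_t tag; bool valid; } vc_line_t;
    static vc_line_t vc[VC_ENTRIES];
    static unsigned vc_next;  /* simple FIFO replacement */

    /* Called when L1 evicts a line: the victim enters the VC. */
    void vc_insert(uint64_t tag) {
        vc[vc_next].tag = tag;
        vc[vc_next].valid = true;
        vc_next = (vc_next + 1) % VC_ENTRIES;
    }

    /* Called on an L1 miss: a VC hit avoids the full miss penalty,
       and the line is swapped back into L1. */
    bool vc_lookup(uint64_t tag) {
        for (unsigned i = 0; i < VC_ENTRIES; i++)
            if (vc[i].valid && vc[i].tag == tag) {
                vc[i].valid = false;  /* line moves back to L1 */
                return true;
            }
        return false;
    }

    int main(void) {
        vc_insert(0xABC);                /* L1 evicts line 0xABC     */
        return vc_lookup(0xABC) ? 0 : 1; /* next miss hits in the VC */
    }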


Critical Word First

• Request the required (critical) data word first from memory
• Let the processor continue execution while the rest of the cache line is being filled


Early Restart

• The words of the block arrive from (main) memory in normal order
• The processor resumes execution as soon as the requested word arrives during the block transfer
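
Both optimizations change only the order, or the restart point, of the line fill. A sketch of the request order under critical word first (the word count is an assumed example):

    #include <stdio.h>

    #define WORDS_PER_LINE 8

    /* Fill order when the processor missed on word `critical`:
       start there and wrap around the line, so the stalled load
       is satisfied by the very first word returned. */
    void fill_order_cwf(int critical) {
        for (int i = 0; i < WORDS_PER_LINE; i++)
            printf("%d ", (critical + i) % WORDS_PER_LINE);
        printf("\n");
    }

    int main(void) {
        fill_order_cwf(5);  /* prints: 5 6 7 0 1 2 3 4 */
        return 0;
    }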


Write Buffer

• Write-through caches rely on write buffers
• Write-back caches use a simple buffer when performing block replacement
• Once the data and the full address are written to the buffer, the write is finished from the processor's viewpoint
• While the processor continues, the write buffer writes the data to the next level of the memory hierarchy


Merging Write Buffer


• Multi-word writes are usually more efficient than one-word writes
• Merging needs a valid bit per word and requires address checking
• Reduces stalls due to a full write buffer and improves buffer efficiency (a sketch follows the figure)

Figure 1: Write buffer merging (Source: Fig. 2.12 [Patterson and Hennessy, 2019]).
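
A minimal sketch of the merging logic (the entry format, sizes and allocation policy are assumptions for illustration):

    #include <stdbool.h>
    #include <stdint.h>

    #define WB_ENTRIES     4
    #define WORDS_PER_LINE 4  /* one buffer entry covers a cache line */

    typedef struct {
        uint64_t line_addr;              /* aligned line address   */
        uint32_t word[WORDS_PER_LINE];
        bool     valid[WORDS_PER_LINE];  /* one valid bit per word */
        bool     used;
    } wb_entry_t;

    static wb_entry_t wb[WB_ENTRIES];

    /* Buffer a one-word write, merging into an existing entry for
       the same line when possible. Returns false when the buffer
       is full, i.e. the processor would have to stall. */
    bool wb_write(uint64_t addr, uint32_t data) {
        uint64_t line = addr / (4 * WORDS_PER_LINE);
        unsigned w = (addr / 4) % WORDS_PER_LINE;

        for (unsigned i = 0; i < WB_ENTRIES; i++)  /* try to merge */
            if (wb[i].used && wb[i].line_addr == line) {
                wb[i].word[w] = data;
                wb[i].valid[w] = true;
                return true;
            }
        for (unsigned i = 0; i < WB_ENTRIES; i++)  /* allocate new */
            if (!wb[i].used) {
                for (unsigned j = 0; j < WORDS_PER_LINE; j++)
                    wb[i].valid[j] = false;
                wb[i].used = true;
                wb[i].line_addr = line;
                wb[i].word[w] = data;
                wb[i].valid[w] = true;
                return true;
            }
        return false;  /* full: stall until an entry drains */
    }

    int main(void) {
        wb_write(0x100, 1);  /* allocates an entry for line 0x100 */
        wb_write(0x104, 2);  /* merges into the same entry        */
        return 0;
    }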


Giving Priority to Read Misses over Writes

• Read misses must be handled as soon as possible, since the processor is stalled waiting for the data
• Writes can happen in the background
• Memory ordering must still be maintained:
  • A load must return the value written by the most recent store to the same address
  • Therefore, on a read miss, the write buffer must be checked first (see the sketch below)
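
A minimal sketch of that check (the entry format is an assumption; real designs may instead drain the buffer on a conflict):

    #include <stdbool.h>
    #include <stdint.h>

    #define WB_ENTRIES 4

    typedef struct { uint64_t addr; uint32_t data; bool valid; } wb_entry_t;
    static wb_entry_t wb[WB_ENTRIES];

    /* On a read miss, snoop the write buffer before going to memory:
       a pending store to the same address must supply the value so
       that the load is never serviced with stale data. */
    bool read_miss_check_wb(uint64_t addr, uint32_t *data_out) {
        for (unsigned i = 0; i < WB_ENTRIES; i++)
            if (wb[i].valid && wb[i].addr == addr) {
                *data_out = wb[i].data;  /* forward the buffered value */
                return true;
            }
        return false;  /* no conflict: fetch from memory as usual */
    }

    int main(void) {
        wb[0] = (wb_entry_t){ 0x200, 42, true };  /* pending store */
        uint32_t v;
        return (read_miss_check_wb(0x200, &v) && v == 42) ? 0 : 1;
    }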


Reducing Hit Time

Small & simple first-level cache

• Faster clocks & limited power budgets encourage smaller L1 caches
• A lower level of associativity reduces both hit time and power
• The critical timing path of a cache hit is a three-step process:
  • Accessing tag memory using the index field of the address
  • Comparing the read tag to the tag field of the address
  • Setting the output multiplexer to choose the correct data item
• Simple case: direct-mapped caches can overlap the tag check with the transmission of the data, effectively reducing hit time


Reducing Hit Time Cont’d

Pipelining access and multi-banked caches

• Pipelining the L1 access allows a higher clock frequency, at the cost of increased hit latency in cycles
• Pipelining is better suited to the instruction cache, where accurate branch prediction hides the extra latency
• A multi-banked cache increases memory throughput, which suits superscalar processors (see the sketch below)
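
A common bank-selection scheme interleaves consecutive cache lines across banks so that streaming accesses proceed in parallel (the line size and bank count below are assumed examples):

    #include <stdint.h>
    #include <stdio.h>

    #define LINE_BYTES 64
    #define NUM_BANKS  4

    /* Sequential interleaving: consecutive lines map to consecutive
       banks, so a streaming access pattern keeps all banks busy. */
    static unsigned bank_of(uint64_t addr) {
        return (addr / LINE_BYTES) % NUM_BANKS;
    }

    int main(void) {
        for (uint64_t a = 0; a < 8 * LINE_BYTES; a += LINE_BYTES)
            printf("line addr 0x%03lx -> bank %u\n",
                   (unsigned long)a, bank_of(a));
        return 0;
    }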


Reducing Hit Time Cont’d

Reducing write hit time

• Writes normally take two cycles: one for the tag check and a second for the data write if it is a hit
• Alternatively, design a data cache that performs the write in one cycle and restores the old value if the tag does not match
• Pipelined writes: hold the write data in a store buffer ahead of the cache, and write it into the data array during the next store's tag check (a sketch follows)
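
A minimal sketch of the pipelined-write idea (the names and one-entry buffer are illustrative assumptions; the stubs stand in for the real tag and data arrays):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct { uint64_t addr; uint32_t data; bool valid; } store_buf_t;
    static store_buf_t pending;  /* one store held ahead of the cache */

    /* Stand-in stubs for the cache arrays (illustrative only). */
    static bool tag_check(uint64_t addr) { (void)addr; return true; }
    static void data_array_write(uint64_t addr, uint32_t data) {
        printf("write %u to 0x%lx\n", data, (unsigned long)addr);
    }

    /* Each store spends its cycle on the tag check; the data write
       of the PREVIOUS store proceeds in parallel, using the data
       array port that would otherwise sit idle. */
    void store(uint64_t addr, uint32_t data) {
        if (pending.valid)
            data_array_write(pending.addr, pending.data);
        pending = (store_buf_t){ addr, data, tag_check(addr) };
    }

    int main(void) {
        store(0x10, 1);  /* buffered; nothing written yet          */
        store(0x20, 2);  /* writes 1 to 0x10 during this tag check */
        return 0;
    }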


Suggested Reading

• Read the relevant sections of Chapter 5 of [Patterson and Hennessy, 2021]
• Read Section 2.3 of [Patterson and Hennessy, 2019]


Acknowledgment

• Preparation of this material was partly supported by Lampro Mellon Pakistan.


References

Patterson, D. and Hennessy, J. (2021). Computer Organization and Design RISC-V Edition: The Hardware/Software Interface, 2nd Edition. Morgan Kaufmann.

Patterson, D. and Hennessy, J. (2019). Computer Architecture: A Quantitative Approach, 6th Edition. Morgan Kaufmann.
