Cache Optimizations
Muhammad Tahir
Lecture 21
Electrical Engineering Department
University of Engineering and Technology Lahore
Contents
1 Cache Performance
2 Reducing the Miss Rate
3 Reducing the Miss Penalty
4 Reducing Hit Time
Cache Performance Analysis
• Average memory access time (AMAT) when using cache
AMAT = (1 − MissRate) × HitTime + MissRate × MissTime
• Define MissTime
MissTime = HitTime + MissPenalty
• Substituting MissTime and simplifying gives
AMAT = HitTime + MissRate × MissPenalty
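As a quick sanity check, here is a minimal sketch of the formula in C; the 1-cycle hit time, 5% miss rate, and 20-cycle miss penalty are made-up illustrative values, not figures from the lecture.

```c
#include <stdio.h>

/* Minimal AMAT calculator; all cycle counts and rates below are
 * made-up illustrative values. */
int main(void) {
    double hit_time     = 1.0;   /* cycles to hit in the cache     */
    double miss_rate    = 0.05;  /* misses per memory access       */
    double miss_penalty = 20.0;  /* extra cycles paid on a miss    */

    double amat = hit_time + miss_rate * miss_penalty;
    printf("AMAT = %.2f cycles\n", amat);  /* prints: AMAT = 2.00 cycles */
    return 0;
}
```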
Cache Performance Analysis Cont’d
• Hit Time: Time to find the block in the cache and return it to
the CPU
• Miss Rate: Number of misses divided by the total number of
memory accesses made by the CPU
• Miss Penalty: Number of additional cycles required upon
encountering a miss to fetch a block from the next level of
memory hierarchy
Cache Performance Analysis Cont’d
• To reduce AMAT, we require
• Hit Time to be low → small and fast cache
• Miss Rate to be low → large and/or smart cache
• Miss Penalty to be low → reduced main memory access time
Reducing the Miss Rate
Larger cache block size
• Advantage
• Reduces compulsory misses by exploiting spatial locality
• Disadvantage
• Increases miss penalty
Choosing the right block size is a complex trade-off (the sketch below illustrates the spatial-locality benefit)
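To see why larger blocks help, consider the classic traversal-order example below; it is a sketch assuming 64-byte cache lines and 4-byte ints, and the function names are ours.

```c
#include <stddef.h>

#define N 1024

/* Illustrative only: with 64-byte lines, the row-major walk touches
 * each line 16 times (16 ints per line) before moving on, so a larger
 * block turns would-be compulsory misses into hits; the column-major
 * walk jumps a full row each step and gains little from bigger lines. */
long sum_row_major(int m[N][N]) {
    long s = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += m[i][j];   /* consecutive addresses: spatial locality */
    return s;
}

long sum_col_major(int m[N][N]) {
    long s = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += m[i][j];   /* stride of N ints: poor spatial locality */
    return s;
}
```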
Reducing the Miss Rate Cont’d
Larger cache size
• Advantages
• Reduces capacity misses
• Reduces conflict misses
• Disadvantages
• Larger hit time
• Higher cost, area & power consumption
Reducing the Miss Rate Cont’d
Higher associativity
• Advantages
• Reduces conflict misses
• Disadvantages
• Increases hit time (due to extra hardware)
• Complex design
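A minimal sketch of a set-associative lookup, assuming made-up parameters (4 ways, 64 sets, 64-byte lines); it shows both why conflict misses drop (a block may live in any way of its set) and where the extra hit-time cost comes from (multiple tag comparisons).

```c
#include <stdint.h>
#include <stdbool.h>

#define WAYS        4
#define SETS        64
#define OFFSET_BITS 6   /* log2(64-byte line) */
#define INDEX_BITS  6   /* log2(64 sets)      */

typedef struct {
    bool     valid;
    uint64_t tag;
} line_t;

static line_t cache[SETS][WAYS];

bool lookup(uint64_t addr) {
    uint64_t index = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    uint64_t tag   = addr >> (OFFSET_BITS + INDEX_BITS);

    /* A block may reside in any of the WAYS lines of its set, so a
     * conflict needs WAYS+1 hot blocks with the same index; the price
     * is comparing WAYS tags (extra hardware, longer hit time). */
    for (int w = 0; w < WAYS; w++)
        if (cache[index][w].valid && cache[index][w].tag == tag)
            return true;   /* hit */
    return false;          /* miss */
}
```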
Reducing the Miss Penalty
• Multi-level caches
• Victim Caches
• Critical Word First and Early Restart
• Merging Write Buffer
• Giving Priority to Read Misses over Writes (i.e., reducing the
read miss penalty)
Multi-level Caches
• Make the first-level cache (L1) fast to keep pace with the
higher clock rate of the processor
• Make the lower-level caches (L2 and L3) large to reduce
accesses to main memory
[Figure: memory hierarchy. The CPU connects to the L1, L2, and L3 caches and then to main memory; caches near the CPU are fast and small, those farther away are slow and large.]
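The two-level AMAT formula follows the same pattern as before, applied recursively: the L1 miss penalty is itself an AMAT for L2. A minimal sketch with made-up cycle counts and (local) miss rates:

```c
#include <stdio.h>

/* AMAT with two cache levels; all numbers are made-up illustrative
 * values. The L2 term applies only to the fraction of accesses that
 * miss in L1. */
int main(void) {
    double l1_hit = 1.0,  l1_miss_rate = 0.05;
    double l2_hit = 10.0, l2_miss_rate = 0.20;  /* local miss rate of L2 */
    double mem_penalty = 100.0;

    double l1_miss_penalty = l2_hit + l2_miss_rate * mem_penalty;
    double amat = l1_hit + l1_miss_rate * l1_miss_penalty;
    printf("AMAT = %.2f cycles\n", amat);  /* 1 + 0.05 * 30 = 2.50 */
    return 0;
}
```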
Multi-level Caches Cont’d
• First-level caches (applicable to both instruction and data)
• Latency is the most critical parameter
• Smaller with lower associativity
• Tag and data are accessed simultaneously
• Second-level caches
• Designed for better tradeoff between hit rate and access
latency
• Larger size with higher associativity
• Tag and data are accessed sequentially
Victim Caches
• A victim cache (VC) is a small associative backup cache added
to a direct-mapped cache
• Holds the most recently evicted cache lines
• Provides the fast hit time of a direct-mapped cache with
reduced conflict misses
[Figure: a victim cache placed alongside the L1 cache. Blocks evicted from L1 enter the victim cache; an access that misses in L1 but hits in the victim cache is serviced from there, while blocks evicted from the victim cache go to the L2 cache/main memory.]
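A sketch of the L1-miss path with a victim cache; the structure, entry count, and function name are made-up for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Tiny fully associative buffer holding the last few lines evicted
 * from a direct-mapped L1; all sizes here are made-up. */
#define VC_ENTRIES 4

typedef struct { bool valid; uint64_t tag; } vc_line_t;
static vc_line_t vc[VC_ENTRIES];

/* Called on an L1 miss: probe every entry (in hardware, in parallel). */
bool vc_lookup(uint64_t block_addr) {
    for (int i = 0; i < VC_ENTRIES; i++)
        if (vc[i].valid && vc[i].tag == block_addr)
            return true;  /* swap the line back into L1, avoid going to L2 */
    return false;         /* genuine miss: fetch from L2/main memory */
}
```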
Critical Word First
• Request the required data word first from memory
• Let the processor continue execution while the rest of the
cache line is being filled
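A sketch of the wrap-around fetch order used by critical word first; the 8-word line size is a made-up parameter.

```c
#include <stdio.h>

/* Memory returns the missed word first, then wraps around the rest
 * of the line so the processor can restart immediately. */
#define WORDS_PER_LINE 8

void fetch_order(unsigned critical, unsigned order[WORDS_PER_LINE]) {
    for (unsigned i = 0; i < WORDS_PER_LINE; i++)
        order[i] = (critical + i) % WORDS_PER_LINE;
}

int main(void) {
    unsigned order[WORDS_PER_LINE];
    fetch_order(5, order);          /* miss on word 5 of the line */
    for (unsigned i = 0; i < WORDS_PER_LINE; i++)
        printf("%u ", order[i]);    /* prints: 5 6 7 0 1 2 3 4 */
    printf("\n");
    return 0;
}
```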
Early Restart
• Data arrives from (main) memory in its normal sequential order
• The processor resumes execution as soon as the requested word
arrives during the block transfer
Write Buffer
• Write-through caches rely on write buffers
• Write-back caches use a simple buffer when performing block
replacement
• Once the data and the full address are written to the buffer, the
write is finished from the processor's viewpoint
• While the processor continues, the write buffer writes data to
the next level memory in the hierarchy
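A minimal sketch of such a write buffer as a FIFO queue; the entry count, field layout, and function names are made-up.

```c
#include <stdint.h>
#include <stdbool.h>

#define WB_ENTRIES 8

typedef struct {
    bool     valid;
    uint64_t addr;
    uint64_t data;
} wb_entry_t;

static wb_entry_t wb[WB_ENTRIES];
static int head, tail, count;

/* Once enqueued, the store is complete from the processor's viewpoint. */
bool wb_enqueue(uint64_t addr, uint64_t data) {
    if (count == WB_ENTRIES) return false;  /* buffer full: processor stalls */
    wb[tail] = (wb_entry_t){ true, addr, data };
    tail = (tail + 1) % WB_ENTRIES;
    count++;
    return true;                            /* store retires immediately */
}

/* Called by the memory side while the processor keeps executing. */
bool wb_drain_one(wb_entry_t *out) {
    if (count == 0) return false;
    *out = wb[head];
    wb[head].valid = false;
    head = (head + 1) % WB_ENTRIES;
    count--;
    return true;
}
```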
Merging Write Buffer
• Multi-word writes are usually more efficient than single-word
writes
• Needs a valid bit per word and requires address checking
(sketched after the figure)
• Reduces stalls due to the write buffer filling up and improves
buffer efficiency
Figure 1: Write buffer merging (Source: Fig. 2.12 [Patterson and Hennessy, 2019]).
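A sketch of the merging logic in the same spirit as the figure; the buffer geometry (four entries of eight 8-byte words) and names are made-up, not taken from the text.

```c
#include <stdint.h>
#include <stdbool.h>

#define WB_ENTRIES    4
#define WORDS_PER_ENT 8

typedef struct {
    bool     valid;
    uint64_t block_addr;            /* address of the 8-word block */
    uint64_t word[WORDS_PER_ENT];
    uint8_t  word_valid;            /* one valid bit per word */
} merge_entry_t;

static merge_entry_t buf[WB_ENTRIES];

bool wb_write(uint64_t addr, uint64_t data) {
    uint64_t block = addr / (WORDS_PER_ENT * 8);
    unsigned word  = (addr / 8) % WORDS_PER_ENT;

    /* Address check: merge into an entry already holding this block. */
    for (int i = 0; i < WB_ENTRIES; i++) {
        if (buf[i].valid && buf[i].block_addr == block) {
            buf[i].word[word]  = data;
            buf[i].word_valid |= 1u << word;
            return true;
        }
    }
    /* Otherwise allocate a free entry (stall if none is free). */
    for (int i = 0; i < WB_ENTRIES; i++) {
        if (!buf[i].valid) {
            buf[i] = (merge_entry_t){ .valid = true, .block_addr = block };
            buf[i].word[word] = data;
            buf[i].word_valid = 1u << word;
            return true;
        }
    }
    return false;   /* full: processor must stall */
}
```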
Giving Priority to Read Misses over Writes
• Read misses must be handled as soon as possible, as the
processor is stalled waiting for data
• Writes can happen in the background
• Memory order must be maintained:
• A load should return the value written by the most recent
store to the same address
• Hence, on a read miss, the write buffer must be checked first
(see the sketch below)
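A sketch of that check: before a read miss is sent to the next level, the (hypothetical) write buffer from the earlier sketch is searched for a matching address.

```c
#include <stdint.h>
#include <stdbool.h>

#define WB_ENTRIES 8

/* Hypothetical write-buffer entry, same shape as the earlier sketch. */
typedef struct { bool valid; uint64_t addr; uint64_t data; } wb_entry_t;
static wb_entry_t wb[WB_ENTRIES];

/* Snoop the write buffer so the load returns the value of the most
 * recent buffered store to the same address; only if nothing matches
 * may the read be issued ahead of the queued writes. */
bool forward_from_wb(uint64_t addr, uint64_t *data) {
    for (int i = 0; i < WB_ENTRIES; i++) {
        if (wb[i].valid && wb[i].addr == addr) {
            *data = wb[i].data;   /* serviced from the buffer */
            return true;
        }
    }
    return false;                 /* safe to prioritize the read */
}
```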
Reducing Hit Time
Small & simple first level cache
• Faster clock & limited power encourage smaller L1 caches
• Lower level of associativity reduces both hit time and power
• Critical timing path in a cache hit is the three-step process
• Accessing tag memory using index field of the address
• Comparing the read tag to the address (tag field)
• Setting the output multiplexer to choose the correct data item
• Simple case: use a direct-mapped cache, as it can overlap the
tag check with the transmission of the data, effectively
reducing hit time
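A sketch of the three-step hit path for a direct-mapped cache, with made-up field widths (64-byte lines, 256 sets); the comments mark where hardware can overlap the data read with the tag compare.

```c
#include <stdint.h>
#include <stdbool.h>

#define OFFSET_BITS 6
#define INDEX_BITS  8
#define SETS        (1u << INDEX_BITS)

typedef struct { bool valid; uint64_t tag; uint8_t data[64]; } line_t;
static line_t cache[SETS];

bool dm_hit(uint64_t addr, uint8_t *byte) {
    uint64_t index  = (addr >> OFFSET_BITS) & (SETS - 1);  /* 1. index the tag memory  */
    uint64_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    uint64_t offset = addr & ((1u << OFFSET_BITS) - 1);

    line_t *l = &cache[index];
    *byte = l->data[offset];           /* data read can begin speculatively */
    return l->valid && l->tag == tag;  /* 2. compare tag; 3. validate data  */
}
```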
Reducing Hit Time Cont’d
Pipelining Access and Multi-banked Cache
• Advantages
• Pipelining L1 allows a higher clock frequency, but at the cost
of increased latency
• More suited for instruction caches, due to the good
performance of branch prediction
• A multi-banked cache increases memory throughput (suitable
for superscalar processors, and sketched below)
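A sketch of bank selection with sequential interleaving; the bank count and block size are made-up parameters.

```c
#include <stdint.h>

/* Consecutive blocks map to different banks, so independent accesses
 * that land in different banks can proceed in parallel. */
#define NBANKS     4
#define BLOCK_SIZE 64

unsigned bank_of(uint64_t addr) {
    return (addr / BLOCK_SIZE) % NBANKS;
}
```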
Reducing Hit Time Cont’d
Reducing write hit time
• Writes take two cycles
• One cycle for the tag check and a second cycle for the data
write on a hit
• Alternative: design a data cache that performs the write in one
cycle and restores the old value if the tag does not match
• Pipelined writes: hold write data in a store buffer ahead of the
cache, and write that data into the cache during the next
store's tag check (sketched below)
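A minimal sketch of the pipelined-write idea; the cache geometry and the single-entry store buffer are made-up, and a store that misses is simply dropped here to keep the sketch short (a real cache would handle the miss).

```c
#include <stdint.h>
#include <stdbool.h>

#define SETS 256

static uint64_t tags[SETS];
static bool     valid[SETS];
static uint64_t data_array[SETS];

/* One-entry store buffer sitting ahead of the data array. */
typedef struct { bool pending; unsigned set; uint64_t data; } store_buf_t;
static store_buf_t sb;

void store(uint64_t addr, uint64_t value) {
    unsigned set = (addr >> 6) % SETS;          /* made-up indexing */
    uint64_t tag = addr >> 14;

    bool hit = valid[set] && tags[set] == tag;  /* cycle N: tag check */

    if (sb.pending)                             /* data array is free this cycle, */
        data_array[sb.set] = sb.data;           /* so retire the previous store   */

    sb = (store_buf_t){ .pending = hit, .set = set, .data = value };
}
```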
Suggested Reading
• Read relevant sections of Chapter 5 of
[Patterson and Hennessy, 2021].
• Read Section 2.3 of [Patterson and Hennessy, 2019].
Acknowledgment
• Preparation of this material was partly supported by Lampro
Mellon Pakistan.
References
Patterson, D. and Hennessy, J. (2021). Computer Organization and Design RISC-V Edition: The Hardware Software Interface, 2nd Edition. Morgan Kaufmann.
Patterson, D. and Hennessy, J. (2019). Computer Architecture: A Quantitative Approach, 6th Edition. Morgan Kaufmann.