Unit 4 Memory Hierarchy
Memory Hierarchy :
• A device to store content in a computer system.
• We don't use a single memory in a computer; instead we use a memory hierarchy.
• Why do we need a memory hierarchy?
• The reason for having 2 or 3 levels of memory hierarchy is performance and economics.
[Figure: memory hierarchy pyramid. Moving down the levels, access time increases and frequency of access from the CPU decreases.]
• Internal : Registers, Cache
• Main Memory : RAM, ROM
• Secondary Memory : HDD, Magnetic Disc
• Tertiary Memory : Tape Drive
Memory Hierarchy :
• The memory unit that communicates directly with the CPU is called
the Main memory.
Ex – RAM & ROM
• Devices that provide backup storage are called Auxiliary Memory.
Ex – Magnetic disks & Tapes
• Only the programs and data currently needed by the processor reside in main memory.
• Other information is stored in auxiliary memory.
• The total memory capacity of a computer can be visualised as a hierarchy of components.
Memory Hierarchy :
• The slow but high-capacity level : Auxiliary memory
• The relatively faster level : Main memory
• Even smaller & faster : Cache memory, accessible to the high-speed processing logic.

Average per-bit memory cost = Total cost / Total size = (S1·C1 + S2·C2 + S3·C3) / (S1 + S2 + S3),
where Si is the size and Ci the per-bit cost of level i.
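As a quick illustration, the average-cost formula can be evaluated in Python. The sizes and per-bit costs below are assumed example values, not taken from the slides:

```python
# Average cost per bit across a multi-level hierarchy:
# weighted sum of (size * cost-per-bit) divided by total size.
def avg_cost_per_bit(levels):
    """levels: list of (size_in_bits, cost_per_bit) tuples."""
    total_cost = sum(s * c for s, c in levels)
    total_size = sum(s for s, _ in levels)
    return total_cost / total_size

# Assumed example: cache is expensive but tiny, disk cheap but huge.
levels = [(1e6, 100.0),    # cache: 1 Mbit at 100 units/bit
          (1e9, 1.0),      # main memory: 1 Gbit at 1 unit/bit
          (1e12, 0.01)]    # disk: 1 Tbit at 0.01 units/bit
print(avg_cost_per_bit(levels))
```

Because the cheap, huge level dominates the total size, the average cost per bit lands close to the cheapest level's cost, which is the economic point of the hierarchy.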
Memory Representation :
• Memory capacity = No. of cells × capacity of 1 cell
Three questions about a memory chip:
1) How many cells do you have?
2) What is the capacity of each cell?
3) What is the total capacity?
Example: 8 cells (numbered 0 to 7), each cell of the same 8-bit capacity:
Total = 8 × 8 bits = 64-bit storage capacity
Memory Representation :
• How do we access each & every cell uniquely?
• When the CPU wants to perform an operation on this chip, how does it identify any cell uniquely?
• Each cell number is called its address and is written in binary format. Using this address, a cell is identified uniquely.
Example: 4 cells of 8 bits each, with 2-bit addresses 00, 01, 10, 11:
Total = 4 × 8 = 32-bit storage capacity
Memory Representation :
Example: 8 cells of 16 bits each; 3-bit addresses 000 to 111 give 8 different address locations.
Total memory = No. of cells × 1 cell capacity
             = No. of memory locations × bits per location
             = 8 × 16 = 128-bit storage capacity
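The cell-count arithmetic above can be sketched in a few lines of Python (`memory_params` is a hypothetical helper name, not from the slides):

```python
import math

# Total memory = number of cells * capacity of each cell.
# Address width = log2(number of cells), since each cell needs a unique address.
def memory_params(num_cells, bits_per_cell):
    total_bits = num_cells * bits_per_cell
    address_bits = int(math.log2(num_cells))
    return total_bits, address_bits

# The 8-cell, 8-bit example from the slide: 64 total bits, 3-bit addresses.
print(memory_params(8, 8))
```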
Cache Memory :
• Use of cache reduces average memory access time.
• If the active portions of the program and data are placed in a fast small memory,
the average memory access time can be reduced, thus reducing the total
execution time of the program. Such a fast small memory is referred to as a cache
memory.
• Cache is a fast, small-capacity memory that should hold the information most likely to be accessed.
• The basic operation of the cache: when the CPU needs to access memory, the cache is examined first.
• If the word is found in the cache, it is read from this fast memory. If the word addressed by the CPU is not found in the cache, the main memory is accessed to read the word.
• The performance of cache memory is frequently measured in terms of a quantity called the hit ratio.
• When the CPU refers to memory and finds the word in the cache, it is said to produce a hit.
• If the word is not found in the cache, it is in main memory and it counts as a miss.
• The ratio of the number of hits to the total references to memory (hits plus misses) is the hit ratio (H).
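The hit-ratio definition is a one-liner in Python:

```python
# Hit ratio H = hits / (hits + misses): the fraction of CPU memory
# references that are satisfied by the cache.
def hit_ratio(hits, misses):
    return hits / (hits + misses)

print(hit_ratio(900, 100))  # 0.9
```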
Types of Cache Access :
Tavg = average memory access time
H = cache hit ratio
M = cache miss ratio (M = 1 − H)
tcm = cache memory access time
tmm = main memory access time

1. Simultaneous Access (Parallel) :
Tavg = H × tcm + M × tmm
For example, with H = 0.9, tcm = 20 ns and tmm = 100 ns: Tavg = 0.9 × 20 + 0.1 × 100 = 28 ns.
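The simultaneous-access formula can be checked numerically. The H, tcm and tmm values below are assumed, chosen to reproduce the slide's 28 ns result:

```python
# Simultaneous (parallel) access: cache and main memory are accessed
# in parallel, so a miss costs only the main-memory access time.
#   Tavg = H * tcm + (1 - H) * tmm
def t_avg_simultaneous(h, tcm, tmm):
    return h * tcm + (1 - h) * tmm

# Assumed values consistent with the slide's 28 ns figure:
print(t_avg_simultaneous(0.9, 20, 100))  # ~28 ns
```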
Physical address widths for the main-memory sizes used in the examples below:
128 KB = 2^17 → 17-bit address; 32 GB = 2^35 → 35 bits; 16 GB = 2^34 → 34 bits; 64 MB = 2^26 → 26 bits.
• The direct-mapping example just described uses a block size of one word.
• The same organization but using a block size of 8 words is shown in the next figure (cache size: 512 words).
• The tag field stored within the cache is common to all 8 words of the same block.
• Every time a miss occurs, an entire block of 8 words must be transferred from main memory to cache memory.
• Although this takes extra time, the hit ratio will most likely improve with a larger block size because of the sequential nature of computer programs.
Discuss the direct cache mapping technique.
Complete the missing parameters in the table below using the direct cache mapping technique; assume memory is byte-addressable (MM size = physical address space).

| MM Size | Cache Size | Block Size | Tag Size | Tag Directory Size | Comparator |
| 128 KB  | 16 KB      | 256 B      | -----    | -----              | -----      |
| 32 GB   | 32 KB      | 1 KB       | -----    | -----              | -----      |
| -----   | 512 KB     | 1 KB       | 17 bits  | -----              | -----      |
| 16 GB   | -----      | 4 KB       | -----    | -----              | -----      |
| 64 MB   | -----      | 64 KB      | -----    | -----              | -----      |
| -----   | 512 KB     | -----      | 7 bits   | -----              | -----      |
Solution (direct mapping splits the physical address as tag | line | block offset, so tag bits = PA bits − line bits − offset bits, and line bits + offset bits = log2(cache size)):

| MM Size | Cache Size | Block Size | Tag Size | Tag Directory Size   | Comparator  |
| 128 KB  | 16 KB      | 256 B      | 3 bits   | 3 × 64 = 192 bits    | 1 × 3 bits  |
| 32 GB   | 32 KB      | 1 KB       | 20 bits  | 20 × 32 = 640 bits   | 1 × 20 bits |
| 64 GB   | 512 KB     | 1 KB       | 17 bits  | 17 × 512 = 8704 bits | 1 × 17 bits |
| 16 GB   | cannot get | 4 KB       | cannot get | cannot get         | cannot get  |
| 64 MB   | cannot get | 64 KB      | cannot get | cannot get         | cannot get  |
| 64 MB   | 512 KB     | cannot get | 7 bits   | cannot get           | 1 × 7 bits  |

MM = Physical Address. In row 3, lines = 2^19 / 2^10 = 512 → PA = 17 + 9 + 10 = 36 bits → MM = 64 GB. In the last row, line bits + offset bits = log2(512 KB) = 19, so PA = 7 + 19 = 26 bits → MM = 64 MB, even though the block size (and hence the tag directory size) stays unknown. In rows 4 and 5 the cache size is missing, so the line bits, and therefore the tag, cannot be determined.
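Under the standard direct-mapping split (tag = PA bits − line bits − offset bits), the table's parameters can be computed mechanically. `direct_map_params` is a hypothetical helper name:

```python
import math

# Direct mapping splits the physical address as: | tag | line | block offset |
#   tag bits = PA bits - line bits - offset bits
def direct_map_params(mm_bytes, cache_bytes, block_bytes):
    pa_bits = int(math.log2(mm_bytes))
    lines = cache_bytes // block_bytes
    line_bits = int(math.log2(lines))
    offset_bits = int(math.log2(block_bytes))
    tag_bits = pa_bits - line_bits - offset_bits
    tag_directory_bits = tag_bits * lines   # one tag stored per cache line
    return tag_bits, tag_directory_bits

# First row of the table: 128 KB MM, 16 KB cache, 256 B blocks.
print(direct_map_params(128 * 1024, 16 * 1024, 256))  # (3, 192)
```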
• The tag field of the CPU address is then compared with both tags in the cache to
determine if a match occurs.
• The comparison logic is done by an associative search of the tags in the set, similar to an associative memory search; thus the name "set-associative".
• When a miss occurs in a set-associative cache and the set is full, it is necessary to
replace one of the tag-data items with a new value.
• The most common replacement algorithms used are: random replacement, first-in first-out (FIFO), and least recently used (LRU).
Set – Associative Mapping Example :

| MM Size | Cache Size | Block Size | Tag Size | Tag Directory Size | Set-Associative Way | Comparator |
| 128 KB  | 16 KB      | 256 B      | 4 bits   | 4 × 64 bits        | 2-way               | 2 × 4 bits |

Address format: | Tag | Set No | Block Offset |
(A k-way set-associative memory means the number of lines in each set is k; e.g. 4-way means 4 lines in each set.)

Working: MM = PA = 17 bits; lines = cache size / block size = 2^14 / 2^8 = 2^6 = 64; sets = 64 / 2 = 32 → 5 set bits; block offset = 8 bits; tag = 17 − 5 − 8 = 4 bits.
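The same working generalises to any k-way cache; a small sketch (`set_assoc_params` is a hypothetical helper name):

```python
import math

# k-way set-associative: address split is | tag | set | block offset |
#   sets = lines / k;  tag bits = PA bits - set bits - offset bits
def set_assoc_params(mm_bytes, cache_bytes, block_bytes, k):
    pa_bits = int(math.log2(mm_bytes))
    lines = cache_bytes // block_bytes
    sets = lines // k
    set_bits = int(math.log2(sets))
    offset_bits = int(math.log2(block_bytes))
    tag_bits = pa_bits - set_bits - offset_bits
    return tag_bits, tag_bits * lines   # (tag size, tag directory size in bits)

# The 2-way example above: 128 KB MM, 16 KB cache, 256 B blocks.
print(set_assoc_params(128 * 1024, 16 * 1024, 256, 2))  # (4, 256)
```

The same call reproduces the later examples, e.g. the 32 GB / 32 KB / 1 KB / 4-way case gives a 22-bit tag.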
Set – Associative Mapping Example :

| MM Size | Cache Size | Block Size | Tag Size | Tag Directory Size | Set-Associative Way | Comparator  |
| 32 GB   | 32 KB      | 1 KB       | 22 bits  | 22 × 32 bits       | 4-way               | 4 × 22 bits |

Working: MM = PA = 35 bits; lines = 2^15 / 2^10 = 2^5 = 32; sets = 32 / 4 = 8 → 3 set bits; block offset = 10 bits; tag = 35 − 3 − 10 = 22 bits.
Set – Associative Mapping Example :

| MM Size | Cache Size | Block Size | Tag Size | Tag Directory Size | Set-Associative Way | Comparator |
| 8 MB    | 512 KB     | 1 KB       | 7 bits   | 7 × 512 bits       | 8-way               | 8 × 7 bits |

Working: lines = 2^19 / 2^10 = 2^9 = 512; sets = 512 / 8 = 64 → 6 set bits; block offset = 10 bits; PA = 7 + 6 + 10 = 23 bits → MM = 2^23 = 8 MB.
Set – Associative Mapping Example :

| MM Size | Cache Size | Block Size | Tag Size | Tag Directory Size | Set-Associative Way | Comparator  |
| 16 GB   | 64 MB      | 4 KB       | 10 bits  | 10 × 16K bits      | 4-way               | 4 × 10 bits |

Working: MM = PA = 34 bits; cache = 64 MB = 2^26; lines = 2^26 / 2^12 = 2^14 = 16K; sets = 16K / 4 = 4K → 12 set bits; block offset = 12 bits; tag = 34 − 12 − 12 = 10 bits.
Writing into Cache :
• 2 types → (1) Write-Through (2) Write-Back (Copy-Back)
• The simplest and most commonly used procedure is to update main memory with every memory write operation.
• The cache memory is updated in parallel if it contains the word at the specified address. This is called the write-through method.
• This method has the advantage that main memory always contains the same data as the cache.
• This characteristic is important in systems with direct memory access (DMA) transfers.
• It ensures that the data residing in main memory are valid at all times, so that an I/O device communicating through DMA receives the most recently updated data.
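A minimal sketch of the write-through behaviour, assuming dict-backed cache and main memory (the class and method names are illustrative, not a real API):

```python
# Write-through sketch: every write goes to main memory; the cache copy
# is updated in parallel only if the address is already cached.
class WriteThroughCache:
    def __init__(self):
        self.cache = {}    # address -> word (small, fast)
        self.memory = {}   # address -> word (large, slow)

    def write(self, addr, word):
        self.memory[addr] = word       # main memory is always updated
        if addr in self.cache:         # cache is updated only on a hit
            self.cache[addr] = word

    def read(self, addr):
        if addr not in self.cache:     # miss: fetch the word from main memory
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]

c = WriteThroughCache()
c.write(0x10, 42)
print(c.read(0x10), c.memory[0x10])   # cache and memory agree: 42 42
```

Note how after any sequence of writes, `memory` always holds the latest data, which is exactly the property that makes write-through safe for DMA.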
LRU Page Replacement Algorithm
LRU stands for "Least Recently Used". This algorithm helps the OS find the pages that have been used within a short recent time frame.
• The page that has not been used for the longest time in main memory will be selected for replacement.
• This algorithm is easy to implement.
• It makes use of a counter along with each page.
Advantages of LRU
• It is an efficient technique.
• With this algorithm, it becomes easy to identify the pages that have not been needed for a long time.
• It helps in full analysis.
Disadvantages of LRU
• It is expensive and has more complexity.
• An additional data structure is needed.
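A short simulation of LRU replacement on a page reference string, using a list ordered by recency (a sketch; real implementations use counters or linked structures):

```python
# LRU page replacement sketch: keep frames ordered by recency of use;
# on a fault with full frames, evict the least recently used page.
def lru_page_faults(reference_string, num_frames):
    frames = []            # front = least recently used, back = most recent
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.remove(page)        # hit: refresh this page's recency
        else:
            faults += 1
            if len(frames) == num_frames:
                frames.pop(0)          # evict the least recently used page
        frames.append(page)            # most recently used goes to the back
    return faults

print(lru_page_faults([7, 0, 1, 2, 0, 3, 0, 4], 3))  # 6 faults
```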
Optimal Page Replacement Algorithm
This algorithm replaces the page that will not be used for the longest time in the future. Practical implementation of this algorithm is not possible.
• Practical implementation is not possible because we cannot predict in advance which pages will not be used for the longest time in the future.
• This algorithm leads to the fewest page faults and is thus the best-known algorithm. It can also be used to measure the performance of other algorithms.
Advantages of OPR
• The algorithm is easy to understand.
• It provides excellent efficiency and is less complex.
• The data structures needed are simple.
Disadvantages of OPR
• The algorithm needs future knowledge of the program.
• Practical implementation is not possible because the operating system cannot track future requests.
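Although OPT is not realisable online, it can be simulated offline when the whole reference string is known, which is how it serves as a benchmark. A sketch:

```python
# Optimal (OPT) page replacement sketch: on a fault with full frames,
# evict the page whose next use lies farthest in the future (or that is
# never used again). It needs the full future reference string, which is
# why it cannot be implemented in a real OS.
def opt_page_faults(refs, num_frames):
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                       # hit: nothing to do
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)            # free frame available
        else:
            # Distance to each resident page's next use; never used -> infinity.
            def next_use(p):
                future = refs[i + 1:]
                return future.index(p) if p in future else float('inf')
            frames.remove(max(frames, key=next_use))
            frames.append(page)
    return faults

print(opt_page_faults([7, 0, 1, 2, 0, 3, 0, 4], 3))  # 6 faults
```

Running the same reference string through both simulators shows OPT never produces more faults than LRU, which is the sense in which it measures other algorithms.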
Memory Interleaving :
• How is it done? Main memory is divided into a number of memory modules, each with its own MAR and MDR, so that several modules can be busy at the same time.
Advantage :
System performance is enhanced because read & write activity occurs simultaneously across the multiple modules in a similar fashion.
• There are 2 address formats for interleaving the memory address space :
1. High-order Interleaving :
It uses the high-order bits as the module address & the lower-order bits as the word address within each module. With the address split as H (MSB) | L (LSB), each of the modules 00, 01, 10, 11 holds one consecutive block of words.
2. Low-order Interleaving :
It uses the lower-order bits as the module address & the high-order bits as the word address within each module, so consecutive addresses fall in consecutive modules.
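The two address splits can be contrasted with a small sketch, here using division and modulus in place of bit slicing (function names are illustrative):

```python
# Low-order interleaving: low address bits pick the module, so consecutive
# addresses fall in different modules and can be accessed in parallel.
def low_order_interleave(addr, num_modules):
    return addr % num_modules, addr // num_modules    # (module, word)

# High-order interleaving: high address bits pick the module, so each
# module holds one consecutive block of the address space.
def high_order_interleave(addr, words_per_module):
    return addr // words_per_module, addr % words_per_module  # (module, word)

# With 4 modules of 4 words each, compare where consecutive addresses land:
for a in range(4):
    print(a, low_order_interleave(a, 4), high_order_interleave(a, 4))
```

Under low-order interleaving addresses 0, 1, 2, 3 land in modules 0, 1, 2, 3; under high-order interleaving they all land in module 0, which is why low-order interleaving is the one that speeds up sequential access.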