William Stallings
Computer Organization
and Architecture
7th Edition
Chapter 4
Cache Memory
Characteristics
• Location
• Capacity
• Unit of transfer
• Access method
• Performance
• Physical type
• Physical characteristics
• Organisation
Location
• CPU
• Internal
• External
Capacity
• Internal/External memory capacity is
typically expressed in terms of bytes.
• Word size
—The natural unit of organisation
—Common word lengths are 8, 16, and 32 bits
• The size of the word is typically equal to the
number of bits used to represent an integer and
to the instruction length.
• Addressable unit
—Smallest location which can be uniquely
addressed
—For an address of A bits, there are N = 2^A addressable units
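—Example (illustrative figures): with A = 16 address bits, N = 2^16 = 65,536 addressable units; with A = 32 bits, N = 2^32 = 4,294,967,296 units (4 GiB if each unit is one byte)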
Unit of Transfer
• Internal
—Usually governed by data bus width
—For main memory, this is the number of bits
read out of or written into memory at a time.
• External
—Usually a block which is much larger than a
word
Access Methods (1)
• Sequential
—Start at the beginning and read through in
order
—Access time depends on location of data and
previous location
• Direct
— Individual blocks have unique addresses
— Access is by jumping to the vicinity plus a sequential search
— Access time depends on location and previous location
— e.g. disk
— Direct memory access (DMA), a separate concept from the direct access method above, allows an input/output (I/O) device to send or receive data directly to or from main memory, bypassing the CPU to speed up memory operations; the process is managed by a chip known as a DMA controller (DMAC)
Access Methods (2)
• Random
—Individual addresses identify locations exactly
—Access time is independent of location or
previous access
—e.g. RAM
• Associative
—Data is located by a comparison with contents
of a portion of the store
—Access time is independent of location or
previous access
—e.g. cache
Memory Hierarchy
• Registers
—In CPU
• Internal or Main memory
—May include one or more levels of cache
—“RAM”
• External memory
—Backing store
Memory Hierarchy - Diagram
Performance
• Access time
—Time between presenting the address and
getting the valid data
—The time it takes to perform a read or write operation
• Memory Cycle time
—Time may be required for the memory to “recover” before the next access
—Cycle time = access time + recovery time
• Transfer Rate
—Rate at which data can be moved into or out of a memory unit
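—For random-access memory, the transfer rate is 1/(cycle time); for example (illustrative figures), a cycle time of 10 ns gives 100 million transfers per second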
Physical Types
• Semiconductor
—RAM
• Magnetic
—Disk & Tape
• Optical
—CD & DVD
• Others
—Bubble
—Hologram
Physical Characteristics
• Magnetic surface memories are non-volatile
• Semiconductor memories may be either volatile or non-volatile
• Volatile
—Information decays over time
—Information is lost on power failure/switch off
• Non-volatile
—Information does not decay once recorded, unless deliberately changed
Organisation
• For random-access memory, the organisation is a key design issue
• Physical arrangement of bits into words
The Bottom Line
• How much?
—Capacity
• How fast?
—Time is money
• How expensive?
Hierarchy List
• Registers
• L1 Cache
• L2 Cache
• Main memory
• Disk cache
• Disk
• Optical
• Tape
Cache
• Small amount of fast memory
• Sits between normal main memory and
CPU
• May be located on CPU chip or module
Cache/Main Memory Structure
Cache operation – overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from
main memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which
block of main memory is in each cache
slot
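A minimal sketch in C of the read flow above for a direct-mapped cache; the line count, block size, structure names, and simulated main memory are illustrative assumptions, not taken from the slides.

#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES      128   /* assumed number of cache lines         */
#define WORDS_PER_LINE 4     /* assumed block (line) size, in words   */
#define MEM_WORDS      65536 /* assumed size of simulated main memory */

typedef struct {
    bool     valid;                  /* does this line hold a block?     */
    uint32_t tag;                    /* identifies which block it holds  */
    uint32_t data[WORDS_PER_LINE];   /* the cached copy of the block     */
} cache_line_t;

static cache_line_t cache[NUM_LINES];
static uint32_t     main_memory[MEM_WORDS];

/* Copy one block from (simulated) main memory into a cache line. */
static void fetch_block(uint32_t block, uint32_t *dest)
{
    for (int i = 0; i < WORDS_PER_LINE; i++)
        dest[i] = main_memory[block * WORDS_PER_LINE + i];
}

/* CPU requests the contents of a (word) address. */
uint32_t cpu_read(uint32_t address)
{
    uint32_t word  = address % WORDS_PER_LINE;   /* word within the block          */
    uint32_t block = address / WORDS_PER_LINE;   /* main memory block number       */
    uint32_t line  = block % NUM_LINES;          /* cache slot this block maps to  */
    uint32_t tag   = block / NUM_LINES;          /* distinguishes blocks that share the slot */

    /* Check cache: if the block is not present, read it from main memory first. */
    if (!cache[line].valid || cache[line].tag != tag) {
        fetch_block(block, cache[line].data);
        cache[line].tag   = tag;
        cache[line].valid = true;
    }
    /* Deliver the word from the cache to the CPU. */
    return cache[line].data[word];
}

With these assumed sizes, block j of main memory always lands in cache line j modulo 128, which is the direct mapping rule introduced on the following slides.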
Cache Read Operation - Flowchart
Typical Cache Organization
Cache Design
• Size
• Mapping Function
Size does matter
• Cost
—More cache is expensive
• Speed
—More cache is faster (up to a point)
—Checking cache for data takes time
Comparison of Cache Sizes
Processor | Type | Year of Introduction | L1 cache | L2 cache | L3 cache
IBM 360/85 | Mainframe | 1968 | 16 to 32 KB | — | —
PDP-11/70 | Minicomputer | 1975 | 1 KB | — | —
VAX 11/780 | Minicomputer | 1978 | 16 KB | — | —
IBM 3033 | Mainframe | 1978 | 64 KB | — | —
IBM 3090 | Mainframe | 1985 | 128 to 256 KB | — | —
Intel 80486 | PC | 1989 | 8 KB | — | —
Pentium | PC | 1993 | 8 KB/8 KB | 256 to 512 KB | —
PowerPC 601 | PC | 1993 | 32 KB | — | —
PowerPC 620 | PC | 1996 | 32 KB/32 KB | — | —
PowerPC G4 | PC/server | 1999 | 32 KB/32 KB | 256 KB to 1 MB | 2 MB
IBM S/390 G4 | Mainframe | 1997 | 32 KB | 256 KB | 2 MB
IBM S/390 G6 | Mainframe | 1999 | 256 KB | 8 MB | —
Pentium 4 | PC/server | 2000 | 8 KB/8 KB | 256 KB | —
IBM SP | High-end server/supercomputer | 2000 | 64 KB/32 KB | 8 MB | —
CRAY MTA | Supercomputer | 2000 | 8 KB | 2 MB | —
Itanium | PC/server | 2001 | 16 KB/16 KB | 96 KB | 4 MB
SGI Origin 2001 | High-end server | 2001 | 32 KB/32 KB | 4 MB | —
Itanium 2 | PC/server | 2002 | 32 KB | 256 KB | 6 MB
IBM POWER5 | High-end server | 2003 | 64 KB | 1.9 MB | 36 MB
CRAY XD-1 | Supercomputer | 2004 | 64 KB/64 KB | 1 MB | —
Mapping Function
• A mapping function is the method used to determine which cache
line a given main memory block occupies
• It is used when copying a block from main
memory to the cache and it is used again
when trying to retrieve data from the
cache
• There are three kinds of mapping
functions
—Direct
—Associative
—Set Associative
Direct Mapping
• Each block of main memory maps to only
one cache line
—i.e. if a block is in cache, it must be in one
specific place
• Address is in two parts
• Least Significant w bits identify unique
word
• Most Significant s bits specify one memory
block
• The MSBs are split into a cache line field r
and a tag of s-r (most significant)
• The mapping is expressed as
i = j modulo m
• where
—i = cache line number
—j = main memory block number
—m = number of lines in the cache
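A small C sketch of the address split described above; the field widths (w = 2 word bits, r = 7 line bits) and the example address are assumed purely for illustration.

#include <stdint.h>
#include <stdio.h>

#define W_BITS 2   /* w: word-within-block bits (assumed)  */
#define R_BITS 7   /* r: cache line (index) bits (assumed) */

int main(void)
{
    uint32_t address = 0x0001ABCDu;                              /* arbitrary example address    */

    uint32_t word = address & ((1u << W_BITS) - 1);              /* least significant w bits     */
    uint32_t line = (address >> W_BITS) & ((1u << R_BITS) - 1);  /* next r bits give i = j mod m */
    uint32_t tag  = address >> (W_BITS + R_BITS);                /* remaining s - r bits = tag   */

    printf("word=%u line=%u tag=0x%X\n",
           (unsigned)word, (unsigned)line, (unsigned)tag);
    return 0;
}

Compiled and run, this prints the word, line, and tag fields for the example address.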
Cache Constants
• cache size / line size = number of lines
• log2(line size) = bits for offset
• log2(number of lines) = bits for cache index
• remaining upper bits = tag address bits
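• Worked example (assumed sizes, not from the slides): a 16 KB cache with 32-byte lines has 16 KB / 32 B = 512 lines, log2(32) = 5 offset bits, log2(512) = 9 index bits, and with 32-bit addresses a tag of 32 - 5 - 9 = 18 bits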
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
Associative Mapping
• A main memory block can load into any
line of cache
• Memory address is interpreted as tag and
word
• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
Associative mapping summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Size of tag = s bits
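• Worked example (assumed figures): with a 24-bit address (s + w = 24) and 4-byte blocks (w = 2), the tag is s = 22 bits and main memory holds 2^22 blocks; every tag in the cache must be compared against those 22 bits on each access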
Associative Mapping
Advantages/disadvantages
• Cache searching gets slower
• A main memory block can load into any line of cache (no line needs
to be evicted while empty lines remain)
• Hardware implementation is complicated and expensive
Set Associative Mapping
• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given
set
—e.g. Block B can be in any line of set i
• e.g. 2 lines per set
—2 way associative mapping
—A given block can be in one of 2 lines in only
one set
Set Associative Mapping Summary
• Address length is s + w bits
• Cache is divided into a number of sets: v = 2^d
• Each set contains k lines
• A cache with k lines per set is called k-way set associative
• Number of lines in the cache = v • k = k • 2^d
• Size of tag = (s - d) bits
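• Worked example (assumed figures): a 2-way set associative cache (k = 2) with v = 2^13 sets (d = 13) and 4-byte lines (w = 2) has k • 2^d = 16,384 lines; with a 24-bit address (s + w = 24, so s = 22) the tag is s - d = 22 - 13 = 9 bits, and only the k = 2 tags in the selected set are compared on each access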