Operating Systems: Internals and Design Principles
Chapter 1: Computer System Overview
Eighth Edition
By William Stallings
Operating System
Exploits the hardware resources of one or more processors
Provides a set of services to system users
Manages secondary memory and I/O devices
Basic Elements
Processor
Main Memory
I/O Modules
System Bus
Processor
Controls the operation of the computer
Performs the data processing functions
Referred to as the Central Processing Unit (CPU)
Main Memory
Volatile
Contents of the memory are lost when the computer is shut down
Referred to as real memory or primary memory
I/O Modules
Moves data between the computer and external environments such as:
storage (e.g. hard drive)
communications equipment
terminals
System Bus
Provides for communication among processors, main memory, and I/O modules
[Figure 1.1 Computer Components: Top-Level View. The CPU (PC, IR, MAR, MBR, I/O AR, I/O BR, execution unit), main memory (locations 0 to n-1 holding instructions and data), and the I/O module (buffers) are connected by the system bus.]
PC = Program counter
IR = Instruction register
MAR = Memory address register
MBR = Memory buffer register
I/O AR = Input/output address register
I/O BR = Input/output buffer register
Microprocessor
Invention that brought about desktop
and handheld computing
Processor on a single chip
Fastest general purpose processor
Multiprocessors
Each chip (socket) contains multiple
processors (cores)
Graphical Processing Units (GPUs)
Provide efficient computation on arrays
of data using Single-Instruction Multiple
Data (SIMD) techniques
Used for general numerical processing
Physics simulations for games
Computations on large spreadsheets
Digital Signal Processors
(DSPs)
Deal with streaming signals such as
audio or video
Used to be embedded in devices like
modems
Encoding/decoding speech and video
(codecs)
Support for encryption and security
System on a Chip
(SoC)
To satisfy the requirements of handheld
devices, the microprocessor is giving way
to the SoC
Components such as DSPs, GPUs,
codecs and main memory, in
addition to the CPUs and caches,
are on the same chip
Instruction Execution
A program consists of a set of instructions
stored in memory
Two steps:
processor reads (fetches) instructions from memory
processor executes each instruction
[Figure 1.2 Basic Instruction Cycle: START; fetch stage (fetch next instruction); execute stage (execute instruction); HALT]
The processor fetches the instruction from memory
Program counter (PC) holds address of the
instruction to be fetched next
PC is incremented after each fetch
Instruction Register (IR)
Fetched instruction is loaded into Instruction Register (IR)
Processor interprets the instruction and performs required action:
Processor-memory
Processor-I/O
Data processing
Control
Figure 1.4 Example of Program Execution
(contents of memory and registers in hexadecimal)
Initial memory: 300 = 1940, 301 = 5941, 302 = 2941, 940 = 0003, 941 = 0002
Step  Stage    PC   AC    IR    Note
1     Fetch    300        1940
2     Execute  301  0003  1940
3     Fetch    301  0003  5941
4     Execute  302  0005  5941  3 + 2 = 5
5     Fetch    302  0005  2941
6     Execute  303  0005  2941  location 941 now holds 0005
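The trace in Figure 1.4 can be reproduced with a few lines of simulation. This is a minimal sketch, not the book's definition of the machine: the opcode meanings (1 = load AC from memory, 5 = add memory to AC, 2 = store AC to memory), the 4-bit opcode and 12-bit address split, and the function name `run` are assumptions inferred from the register values shown in the figure.

```python
# Hypothetical opcode values, inferred from the trace in Figure 1.4.
LOAD, STORE, ADD = 0x1, 0x2, 0x5


def run(memory, pc, count):
    """Run `count` fetch/execute cycles; return (pc, ac, ir) after each cycle."""
    ac = 0
    trace = []
    for _ in range(count):
        # Fetch stage: load the instruction into IR, then increment PC.
        ir = memory[pc]
        pc += 1
        # Execute stage: top 4 bits are the opcode, low 12 bits the address.
        opcode, address = ir >> 12, ir & 0xFFF
        if opcode == LOAD:
            ac = memory[address]
        elif opcode == ADD:
            ac = (ac + memory[address]) & 0xFFFF
        elif opcode == STORE:
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
        trace.append((pc, ac, ir))
    return trace


memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}
trace = run(memory, pc=0x300, count=3)
for pc, ac, ir in trace:
    print(f"PC={pc:03X} AC={ac:04X} IR={ir:04X}")
```

Each tuple in `trace` corresponds to the register contents after an execute stage (Steps 2, 4, and 6 in the figure).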
Interrupts
Interrupt the normal sequencing of the
processor
Provided to improve processor utilization
most I/O devices are slower than the processor
processor must pause to wait for device
wasteful use of the processor
Table 1.1 Classes of Interrupts
Program: Generated by some condition that occurs as a result of an instruction execution, such as arithmetic overflow, division by zero, attempt to execute an illegal machine instruction, and reference outside a user's allowed memory space.
Timer: Generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
I/O: Generated by an I/O controller, to signal normal completion of an operation or to signal a variety of error conditions.
Hardware failure: Generated by a failure, such as power failure or memory parity error.
[Figure 1.5a Flow of Control Without Interrupts. Panel (a) No interrupts: user program segments 1, 2, and 3 are separated by WRITE calls; each WRITE runs I/O program segment 4, the I/O command, and segment 5 (END) before the user program continues.]
[Figure 1.5b Short I/O Wait. Panel (b) Interrupts; short I/O wait: after segment 4 and the I/O command, the user program continues (2a, 3a); the interrupt handler runs segment 5, then the user program resumes (2b, 3b).]
[Figure 1.5c Long I/O Wait. Panel (c) Interrupts; long I/O wait: the same program and interrupt handler, with an I/O operation that is still in progress when the next WRITE is reached.]
[Figure 1.6 Transfer of Control via Interrupts: the interrupt occurs in the user program after instruction i; control passes to the interrupt handler and then returns to instruction i+1.]
[Figure 1.7 Instruction Cycle with Interrupts: START; fetch stage (fetch next instruction); execute stage (execute instruction); then, if interrupts are enabled, interrupt stage (check for interrupt; initiate interrupt handler); if interrupts are disabled, return directly to the fetch stage; HALT]
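The instruction cycle with an interrupt stage can be sketched as a loop that checks for a pending interrupt only after the current instruction has finished. The sketch below is illustrative only; the function name `run_cycle`, the `arrivals` mapping, and the string labels are assumptions, not part of the original material.

```python
def run_cycle(program, arrivals, interrupts_enabled=True):
    """Return the order in which instructions and handlers run.

    program   -- list of instruction labels, executed in order
    arrivals  -- dict: instruction index -> name of an interrupt raised
                 while that instruction is executing (hypothetical model)
    """
    log = []
    pending = []
    for index, instruction in enumerate(program):
        # Fetch and execute stages.
        log.append(instruction)
        if index in arrivals:
            pending.append(arrivals[index])
        # Interrupt stage: checked only after the instruction completes.
        if interrupts_enabled and pending:
            log.append("handler:" + pending.pop(0))
    return log


enabled_log = run_cycle(["i0", "i1", "i2"], {1: "io"})
disabled_log = run_cycle(["i0", "i1", "i2"], {1: "io"}, interrupts_enabled=False)
print(enabled_log)
print(disabled_log)
```

With interrupts enabled the handler runs between `i1` and `i2`; with interrupts disabled the program runs straight through and the interrupt stays pending.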
[Figure 1.8 Program Timing: Short I/O Wait. (a) Without interrupts: the processor waits during each I/O operation (between segments 4 and 5). (b) With interrupts: each I/O operation runs concurrently with the processor executing user code (2a, 3a), so total time is shorter.]
[Figure 1.9 Program Timing: Long I/O Wait. (a) Without interrupts: the processor waits during each I/O operation. (b) With interrupts: each I/O operation runs concurrently with the processor executing, then the processor waits for the remainder of the operation.]
Figure 1.10 Simple Interrupt Processing
Hardware (in order):
Device controller or other system hardware issues an interrupt
Processor finishes execution of current instruction
Processor signals acknowledgment of interrupt
Processor pushes PSW and PC onto control stack
Processor loads new PC value based on interrupt
Software (in order):
Save remainder of process state information
Process interrupt
Restore process state information
Restore old PSW and PC
[Figure 1.11 Changes in Memory and Registers for an Interrupt. The user's program occupies locations N, N+1, ...; the interrupt service routine runs from Y (start) to Y + L (return); the control stack top moves between T and T–M. (a) Interrupt occurs after instruction at location N: N+1 is saved on the control stack, the program counter is set to Y, and the stack pointer moves from T to T–M. (b) Return from interrupt: the program counter is restored from Y+L+1 to N+1 and the stack pointer returns to T.]
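The save and restore steps of interrupt processing can be sketched with an explicit control stack. This is a simplified model under stated assumptions: the class `Processor`, the functions `enter_interrupt` and `return_from_interrupt`, and the concrete addresses are hypothetical, and the general registers are saved as a single list rather than word by word.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    pc: int
    psw: int
    registers: list
    control_stack: list = field(default_factory=list)


def enter_interrupt(cpu, handler_start):
    # Hardware: push PSW and PC, then load the handler's start address.
    cpu.control_stack.append(cpu.psw)
    cpu.control_stack.append(cpu.pc)
    cpu.pc = handler_start
    # Software (first handler instructions): save the remaining state.
    cpu.control_stack.append(list(cpu.registers))


def return_from_interrupt(cpu):
    # Software: restore the general registers.
    cpu.registers = cpu.control_stack.pop()
    # Restore the old PC and PSW in reverse order of the pushes.
    cpu.pc = cpu.control_stack.pop()
    cpu.psw = cpu.control_stack.pop()


N, Y = 100, 500                       # hypothetical addresses
cpu = Processor(pc=N + 1, psw=0b01, registers=[7, 8, 9])
enter_interrupt(cpu, handler_start=Y)
pc_in_handler = cpu.pc
cpu.registers = [0, 0, 0]             # the handler is free to overwrite registers
return_from_interrupt(cpu)
print(cpu.pc, cpu.psw, cpu.registers)
```

After the return, the interrupted program continues at N+1 with the register contents it had before the interrupt.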
Multiple Interrupts
An interrupt occurs while another interrupt is being processed
• e.g. receiving data from a communications line and printing results at the same time
Two approaches:
• disable interrupts while an interrupt is being processed
• use a priority scheme
[Figure 1.12 Transfer of Control with Multiple Interrupts. (a) Sequential interrupt processing: the user program is interrupted by handler X, and handler Y runs only after X completes. (b) Nested interrupt processing: handler Y interrupts handler X.]
[Figure 1.13 Example Time Sequence of Multiple Interrupts: control passes among the user program, the printer interrupt service routine, the communication interrupt service routine, and the disk interrupt service routine between t = 0 and t = 40.]
Memory Hierarchy
Major constraints in memory
amount
speed
expense
Memory must be able to keep up with the processor
Cost of memory must be reasonable in relationship
to the other components
Memory Relationships
Faster access time = greater cost per bit
Greater capacity = smaller cost per bit
Greater capacity = slower access speed
The Memory Hierarchy
Going down the hierarchy:
decreasing cost per bit
increasing capacity
increasing access time
decreasing frequency of access to the memory by the processor
Figure 1.14 The Memory Hierarchy
Inboard memory: registers, cache, main memory
Outboard storage: magnetic disk, CD-ROM, CD-RW, DVD-RW, DVD-RAM, Blu-Ray
Off-line storage: magnetic tape
[Figure 1.15 Performance of a Simple Two-Level Memory: average access time (vertical axis, ranging from T1 to T1 + T2) plotted against the fraction of accesses involving only Level 1 (hit ratio, horizontal axis, 0 to 1).]
Memory references by the processor tend to cluster
Data is organized so that the percentage of accesses to each successively lower level is substantially less than that of the level above
Can be applied across more than two levels of memory
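The benefit of clustering can be quantified for a two-level memory. The sketch below assumes the usual model in which a hit costs T1 and a miss costs T1 + T2 (the word is first moved into level 1 and then accessed there), which matches the endpoints T1 and T1 + T2 of Figure 1.15. The function name and the sample values are hypothetical.

```python
def average_access_time(hit_ratio, t1, t2):
    """Average access time of a two-level memory.

    hit_ratio -- fraction of accesses satisfied by level 1 (0.0 to 1.0)
    t1, t2    -- access times of level 1 and level 2, in the same unit
    Assumes a miss costs t1 + t2.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * t1 + (1.0 - hit_ratio) * (t1 + t2)


# Hypothetical values: level 1 takes 0.1 us, level 2 takes 1 us, 95% hits.
example = average_access_time(0.95, t1=0.1, t2=1.0)
print(f"{example:.3f} us")
```

With a high hit ratio the average stays close to T1, which is why a small fast level in front of a large slow one pays off.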
Secondary
Memory
Also referred to
as auxiliary
memory
• external
• nonvolatile
• used to store
program and data
files
Invisible to the OS
Interacts with other memory management hardware
Processor must access memory at least once per instruction
cycle
Processor execution is limited by memory cycle time
Exploit the principle of locality with a small, fast memory
[Figure 1.16 Cache and Main Memory. (a) Single cache: word transfer between CPU and cache (fast), block transfer between cache and main memory (slow). (b) Three-level cache organization: CPU, Level 1 (L1) cache (fastest), Level 2 (L2) cache (fast), Level 3 (L3) cache (less fast), main memory (slow).]
[Figure 1.17 Cache/Main-Memory Structure. (a) Cache: lines numbered 0 to C-1, each holding a tag and a block of K words. (b) Main memory: words at addresses 0 to 2^n - 1, grouped into blocks of K words, Block 0 to Block M – 1.]
Figure 1.18 Cache Read Operation (RA = read address)
START
Receive address RA from CPU
Is block containing RA in cache?
Yes: fetch RA word and deliver to CPU
No: access main memory for block containing RA; allocate cache slot for main memory block; load main memory block into cache slot and deliver RA word to CPU
DONE
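The read operation in Figure 1.18 can be sketched as follows. This is an illustrative model, not a hardware description: the class name `Cache`, the block length of 4 words, and the eviction of the oldest-loaded block when the cache is full are all assumptions chosen to keep the example short.

```python
BLOCK_WORDS = 4  # K, the block length in words (hypothetical value)


class Cache:
    def __init__(self, main_memory, num_lines):
        self.main_memory = main_memory   # list of words
        self.num_lines = num_lines       # C, the number of cache lines
        self.lines = {}                  # block number -> list of K words
        self.hits = 0
        self.misses = 0

    def read(self, ra):
        """Return the word at read address `ra`, loading its block on a miss."""
        block, offset = divmod(ra, BLOCK_WORDS)
        if block in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            # Allocate a cache slot; evict the oldest-loaded block if full.
            if len(self.lines) >= self.num_lines:
                oldest = next(iter(self.lines))
                del self.lines[oldest]
            # Load the main memory block into the cache slot.
            start = block * BLOCK_WORDS
            self.lines[block] = self.main_memory[start:start + BLOCK_WORDS]
        # Deliver the RA word to the CPU.
        return self.lines[block][offset]


main_memory = list(range(100, 132))      # 32 words holding 100..131
cache = Cache(main_memory, num_lines=2)
words = [cache.read(ra) for ra in (5, 6, 7, 20, 5)]
print(words, cache.hits, cache.misses)
```

Addresses 5, 6, and 7 fall in the same block, so only the first of them misses; this is the locality effect the cache is designed to exploit.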
Main categories are:
cache size
block size
number of cache levels
mapping function
replacement algorithm
write policy
Cache and Block Size
Cache Size
small caches have significant impact on performance
Block Size
the unit of data exchanged between cache and main memory
Mapping Function
Determines which cache location the block will occupy
Two constraints affect design:
when one block is read in, another may have to be replaced
the more flexible the mapping function, the more complex is the circuitry required to search the cache
Replacement Algorithm
chooses which block to replace when a new block is to be loaded into the cache
Least Recently Used (LRU) Algorithm
effective strategy is to replace a block that has been in the cache the longest with no references to it
hardware mechanisms are needed to identify the least recently used block
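The LRU policy can be illustrated in software with an ordered dictionary that keeps blocks in order of most recent reference. Real caches implement this in hardware; the class `LRUCache` and its `access` method are hypothetical names used only for this sketch.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # least recently used block first

    def access(self, block):
        """Reference `block`; return the evicted block number, or None."""
        if block in self.blocks:
            # Hit: mark the block as most recently used.
            self.blocks.move_to_end(block)
            return None
        evicted = None
        if len(self.blocks) >= self.capacity:
            # Miss with a full cache: replace the least recently used block.
            evicted, _ = self.blocks.popitem(last=False)
        self.blocks[block] = True
        return evicted


lru = LRUCache(capacity=2)
evictions = [lru.access(block) for block in [1, 2, 1, 3, 2]]
print(evictions)
```

When block 3 arrives, block 2 is evicted rather than block 1, because block 1 was referenced more recently even though it was loaded earlier.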
Write Policy
Dictates when the memory write operation takes
place
• can occur every time the block is updated
• can occur when the block is replaced
• minimizes write operations
• leaves main memory in an obsolete state
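The trade-off between the two write policies can be shown by counting main-memory writes. The sketch below assumes a cache with a single line and a stream consisting only of writes; the function name `memory_writes` and the policy labels "write-through" and "write-back" are conventional terms, not taken from the slide.

```python
def memory_writes(updated_blocks, policy):
    """Count main-memory writes for a one-line cache.

    updated_blocks -- block number touched by each processor write, in order
    policy         -- "write-through" (write on every update) or
                      "write-back" (write only when the block is replaced)
    """
    if policy == "write-through":
        return len(updated_blocks)
    if policy != "write-back":
        raise ValueError(f"unknown policy: {policy}")
    writes = 0
    resident = None
    for block in updated_blocks:
        if resident is not None and block != resident:
            writes += 1      # the modified block is written back on replacement
        resident = block
    return writes


updates = [7, 7, 7, 9, 9]
through = memory_writes(updates, "write-through")
back = memory_writes(updates, "write-back")
print(through, back)
```

Write-back performs one memory write instead of five here, but block 9 is still only in the cache at the end, so main memory holds an obsolete copy of it.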
I/O Techniques
When the processor encounters an instruction relating to I/O, it executes that instruction by issuing a command to the appropriate I/O module
Three techniques are possible for I/O operations:
Programmed I/O
Interrupt-Driven I/O
Direct Memory Access (DMA)
Programmed I/O
The I/O module performs the requested action
then sets the appropriate bits in the I/O status
register
The processor periodically checks the status of the
I/O module until it determines the instruction is
complete
With programmed I/O the performance level of
the entire system is severely degraded
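The cost of programmed I/O comes from the status-checking loop. In the sketch below, `IOModule` is a hypothetical device that becomes ready only after a fixed number of status checks; the names and the latency value are assumptions for illustration.

```python
class IOModule:
    """Hypothetical device that needs `latency` status checks to finish."""

    def __init__(self, latency, data):
        self._remaining = latency
        self._data = data

    def status_ready(self):
        # Models reading the I/O status register.
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def read_data(self):
        return self._data


def programmed_read(module):
    """Busy-wait on the status register, then read the data word."""
    polls = 0
    while not module.status_ready():
        polls += 1           # the processor does no useful work here
    return module.read_data(), polls


word, wasted_polls = programmed_read(IOModule(latency=1000, data=0x41))
print(hex(word), wasted_polls)
```

Every iteration of the loop is processor time spent waiting on a slow device, which is the degradation the slide refers to.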
Interrupt-Driven I/O
Processor issues an I/O command to a module and then goes on to do some other useful work
The I/O module will then interrupt the processor to request service when it is ready to exchange data with the processor
The processor executes the data transfer and then resumes its former processing
More efficient than Programmed I/O but still requires active intervention of the processor to transfer data between memory and an I/O module
Interrupt-Driven I/O
Drawbacks
Transfer rate is limited by the speed with which the processor can test and service a device
The processor is tied up in managing an I/O
transfer
a number of instructions must be
executed for each I/O transfer
Direct Memory Access
(DMA)
Performed by a separate module on the system bus or
incorporated into an I/O module
When the processor wishes to read or write data it
issues a command to the DMA module containing:
• whether a read or write is requested
• the address of the I/O device involved
• the starting location in memory to read/write
• the number of words to be read/written
Transfers the entire block of data directly to
and from memory without going through the
processor
processor is involved only at the beginning and end of the
transfer
processor executes more slowly during a transfer when
processor access to the bus is required
More efficient than interrupt-driven or
programmed I/O
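The four items in a DMA command map naturally onto a small record. The sketch below is a software analogy only: `DMACommand`, `dma_transfer`, the device table, and the returned completion string are hypothetical, and a real DMA module would signal completion with an interrupt rather than a return value.

```python
from dataclasses import dataclass


@dataclass
class DMACommand:
    is_read: bool         # True: device to memory; False: memory to device
    device_address: int   # address of the I/O device involved
    memory_start: int     # starting location in memory to read/write
    word_count: int       # number of words to be read/written


def dma_transfer(command, memory, devices):
    """Move the whole block in one operation, with no per-word processor work."""
    device = devices[command.device_address]
    start = command.memory_start
    end = start + command.word_count
    if end > len(memory):
        raise ValueError("transfer runs past the end of memory")
    if command.is_read:
        if len(device) < command.word_count:
            raise ValueError("device holds fewer words than requested")
        memory[start:end] = device[:command.word_count]
    else:
        device[:command.word_count] = memory[start:end]
    return "transfer complete"   # stands in for the completion interrupt


memory = [0] * 8
devices = {3: [10, 20, 30, 40]}
status = dma_transfer(DMACommand(True, 3, memory_start=2, word_count=3),
                      memory, devices)
print(status, memory)
```

The processor's involvement is limited to building the command and handling the completion signal, matching the beginning-and-end role described above.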
Symmetric Multiprocessors
(SMP)
A stand-alone computer system with the
following characteristics:
two or more similar processors of comparable capability
processors share the same main memory and are
interconnected by a bus or other internal connection scheme
processors share access to I/O devices
all processors can perform the same functions
the system is controlled by an integrated operating system
that provides interaction between processors and their
programs at the job, task, file, and data element levels
Performance
• a system with multiple processors will yield greater performance if work can be done in parallel
Scaling
• vendors can offer a range of products with different price and performance characteristics
Availability
• the failure of a single processor does not halt the machine
Incremental Growth
• an additional processor can be added to enhance performance
[Figure 1.19 Symmetric Multiprocessor Organization: several processors, each with its own L1 cache and L2 cache, share a system bus that connects them to main memory and to an I/O subsystem made up of I/O adapters.]
Multicore Computer
Also known as a chip multiprocessor
Combines two or more processors (cores) on a
single piece of silicon (die)
each core consists of all of the components of an
independent processor
In addition, multicore chips also include L2
cache and in some cases L3 cache
[Figure 1.20 Intel Core i7-990X Block Diagram: six cores (Core 0 to Core 5), each with a 32 kB L1-I cache, a 32 kB L1-D cache, and a 256 kB L2 cache; a shared 12 MB L3 cache; DDR3 memory controllers (3 × 8B @ 1.33 GT/s); QuickPath Interconnect (4 × 20b @ 6.4 GT/s).]
Summary
Basic Elements
Evolution of the microprocessor
Instruction execution
Interrupts
• Interrupts and the instruction cycle
• Interrupt processing
• Multiple interrupts
The memory hierarchy
Cache memory
• Motivation
• Cache principles
• Cache design
Direct memory access
Multiprocessor and multicore organization
• Symmetric multiprocessors
• Multicore computers