Chapter 6: Processor Structure and Function
Computer Architecture and Organization
Department of Computer Engineering
CPU Structure
The processor must do the following:
Fetch instructions: The processor reads an instruction from
memory (register, cache, main memory)
Interpret instructions: The instruction is decoded to determine what action is
required
Fetch data: The execution of an instruction may require reading data from
memory or an I/O module.
Process data: The execution of an instruction may require performing some
arithmetic or logical operation on data
Write data: The results of an execution may require writing data to memory or
an I/O module.
To do those things the processor needs to:
Store some data temporarily
It must remember the location of the last instruction so that it can
know where to get the next instruction.
It needs to store instructions and data temporarily while an
instruction is being executed.
In other words, the processor needs a small internal memory
The major components of the processor are an arithmetic and logic
unit (ALU) and a control unit (CU)
The ALU does the actual computation or processing of data
The control unit controls the movement of data and instructions into
and out of the processor and controls the operation of the ALU
In addition to CU and ALU, there is minimal internal memory,
consisting of a set of storage locations, called registers.
CPU With Systems Bus
CPU Internal Structure
Register Organization
Within the processor, there is a set of registers that store data and
instructions temporarily
Number and function of registers vary between processor designs
The registers in the processor perform two roles:
User-visible registers: Enable the machine- or assembly language
programmer to minimize main memory references by optimizing use
of registers.
Control and status registers: Used by the control unit to control the
operation of the processor and by privileged, operating system
programs to control the execution of programs.
User Visible Registers
A user-visible register is one that may be referenced by means of the
machine language that the processor executes
We can characterize these in the following categories:
General Purpose
Data
Address
Condition Codes
General Purpose Registers
Can be assigned to a variety of functions by the programmer
In some cases, any general-purpose register can contain the operand for any
opcode. This provides true general-purpose register use
Often, however, there are restrictions. For example, there may be dedicated
registers for floating-point and stack operations
In some cases, general-purpose registers can be used for addressing
functions
In other cases, there is a partial or clean separation between data
registers and address registers
Data, Address Registers
Data registers may be used only to hold data and cannot be employed in the
calculation of an operand address
Address registers may themselves be somewhat general purpose, or they may be
devoted to a particular addressing mode. Examples:
Segment pointers: A segment register holds the address of the base of the segment
Index registers: These are used for indexed addressing and may be autoindexed.
Stack pointer: If there is user-visible stack addressing, then typically there is a
dedicated register that points to the top of the stack.
This allows implicit addressing; that is, push, pop, and other stack instructions need
not contain an explicit stack operand
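The implicit addressing described above can be sketched in a few lines of Python. This is an illustration only: the class name StackMachine, the memory size, and the choice of a stack that grows toward lower addresses are assumptions, not part of the slides.

```python
class StackMachine:
    """Toy model of a dedicated stack pointer (hypothetical, for illustration)."""

    def __init__(self, size: int = 16) -> None:
        self.memory = [0] * size
        # Assumption: the stack grows toward lower addresses and SP points
        # at the current top of the stack (size means "empty").
        self.sp = size

    def push(self, value: int) -> None:
        # No stack address appears in the "instruction": SP supplies it.
        self.sp -= 1
        self.memory[self.sp] = value

    def pop(self) -> int:
        value = self.memory[self.sp]
        self.sp += 1
        return value


machine = StackMachine()
machine.push(3)
machine.push(7)
top = machine.pop()      # the most recently pushed value
```

Note that push and pop take no address argument; the stack pointer alone determines which memory location is used.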
Register Design Issues
An important issue is whether to use completely general-purpose
registers or to specialize their use
With the use of specialized registers:
It can be implicit in the opcode which type of register a certain operand
specifier refers to
The operand specifier needs to identify only one of a set of specialized
registers, so the instruction length is reduced
On the other hand, specialization limits the programmer’s flexibility
Another design issue is the number of registers, either general purpose
or data plus address, to be provided
Again, this affects instruction set design because more registers
require more operand specifier bits.
Fewer registers result in more memory references
Beyond a certain number, however, additional registers do not noticeably
reduce memory references
Finally, there is the issue of register length
Registers that must hold addresses obviously must be at least long enough to hold
the largest address
Data registers should be able to hold values of most data types
If we make the registers in the processor general purpose, we:
Increase flexibility and programmer options
Increase instruction size and complexity
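The relationship between the number of registers and the operand specifier size can be made concrete with a small calculation. The sketch below is illustrative Python; the function name specifier_bits is hypothetical, and it assumes each register is selected by a plain binary field.

```python
def specifier_bits(register_count: int) -> int:
    """Bits needed for a binary field that selects one of `register_count` registers."""
    if register_count < 1:
        raise ValueError("register_count must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    return (register_count - 1).bit_length()


for count in (8, 16, 32):
    print(count, "registers need", specifier_bits(count), "bits per operand specifier")
```

Doubling the register count adds one bit to every register operand field, which is one reason the choice of register count affects instruction format design.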
Condition Code Registers
A final category of registers, which is at least partially visible to the user, holds
condition codes
Also referred to as flags
Condition codes are bits set by the processor hardware as the result of operations.
e.g. an arithmetic operation may produce a positive, negative, zero, or overflow
result
In addition to the result itself being stored in a register or memory, a condition
code is also set
Condition code bits are collected into one or more registers. Usually, they form
part of a control register.
Generally, machine instructions allow these bits to be read by implicit reference,
but the programmer cannot alter them.
Control & Status Registers
There are a variety of processor registers that are employed to control the
operation of the processor
Four registers are essential to instruction execution
Program counter (PC): Contains the address of an instruction to be fetched
Instruction register (IR): Contains the instruction most recently fetched
Memory address register (MAR): Contains the address of a location in memory
Memory buffer register (MBR): Contains a word of data to be written to
memory or the word most recently read
Not all processors have internal registers designated as MAR and MBR, but
some equivalent buffering mechanism is needed
Data are exchanged with memory using the MAR and MBR
In a bus-organized system, the MAR connects directly to the address bus, and the
MBR connects directly to the data bus
User visible registers, in turn, exchange data with the MBR.
Within the processor, data must be presented to the ALU for processing
The ALU may have direct access to the MBR and user-visible registers
Alternatively, there may be additional buffering registers at the boundary to the
ALU
These registers serve as input and output registers for the ALU and exchange
data with the MBR and user-visible registers.
Program Status Word
Many processor designs include a register or set of registers, often known as the
program status word (PSW)
The PSW typically contains condition codes plus other status information
Common fields or flags include the following:
Sign: Contains the sign bit of the result of the last arithmetic operation.
Zero: Set when the result is 0.
Carry: Set if an operation resulted in a carry (addition) into or borrow
(subtraction) out of a high-order bit.
Equal: Set if a logical compare result is equality.
Overflow: Used to indicate arithmetic overflow.
Interrupt Enable/Disable: Used to enable or disable interrupts.
Supervisor: Indicates whether the processor is executing in
supervisor or user mode
Certain privileged instructions can be executed only in supervisor mode, and
certain areas of memory can be accessed only in supervisor mode
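As an illustration of how the arithmetic flags in the PSW could be derived, the following Python sketch adds two 8-bit values and computes the sign, zero, carry, and overflow bits. The function name add_with_flags and the 8-bit default width are assumptions made for the example; real processors define their own flag sets and rules.

```python
def add_with_flags(a: int, b: int, width: int = 8) -> tuple[int, dict[str, bool]]:
    """Add two `width`-bit patterns and derive the condition codes for the result."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    a &= mask
    b &= mask
    full = a + b                 # result before truncation to `width` bits
    result = full & mask
    flags = {
        "sign": bool(result & sign_bit),   # high-order bit of the result
        "zero": result == 0,
        "carry": full > mask,              # carry out of the high-order bit
        # Signed overflow: operands have the same sign, result has the other sign.
        "overflow": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags


result, flags = add_with_flags(0x7F, 0x01)   # 127 + 1 in 8-bit two's complement
```

Here 0x7F + 0x01 gives 0x80: no carry out of the high-order bit, but the sign and overflow flags are set because the signed result does not fit in 8 bits.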
Other Registers
A number of other registers related to status and control might be
found in a particular processor design
There may be a pointer to a block of memory containing additional
status information (e.g., process control blocks).
In machines using vectored interrupts, an interrupt vector register
may be provided.
If a stack is used to implement certain functions (e.g., subroutine
call), then a system stack pointer is needed
A page table pointer is used with a virtual memory system
Finally, registers may be used in the control of I/O operations.
Example Microprocessor Register Organizations of three processors
Instruction Cycle
An instruction cycle includes the following stages:
· Fetch: Read the next instruction from memory into the processor.
· Execute: Interpret the opcode and perform the indicated operation.
· Interrupt: If interrupts are enabled and an interrupt has occurred, save
the current process state and service the interrupt.
Instruction Cycle with Indirect
Indirect Cycle
The execution of an instruction may involve one or more operands in
memory, each of which requires a memory access.
Further, if indirect addressing is used, then additional memory
accesses are required.
We can think of the fetching of indirect addresses as one more
instruction stage.
After an instruction is fetched, it is examined to determine if any
indirect addressing is involved
If so, the required operands are fetched using indirect addressing
Instruction Cycle State Diagram
Data Flow (Instruction Fetch)
The exact sequence of events during an instruction cycle depends on the design
of the processor
Let us assume a processor that employs a memory address register (MAR), a
memory buffer register (MBR), a program counter (PC), and an instruction
register (IR).
During the fetch cycle, an instruction is read from memory
PC contains address of next instruction to be fetched
This address is moved to the MAR and placed on the address bus
The control unit requests a memory read, and the result is placed on the data bus
and copied into the MBR and then moved to the IR
Meanwhile, the PC is incremented by 1, preparatory for the next fetch.
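The register transfers of the fetch cycle can be sketched as follows. This is a simplified Python model, not a description of any particular processor: the CPU class, the word-addressed memory list, and the sample instruction words are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CPU:
    """Hypothetical register set: PC, MAR, MBR, IR plus word-addressed memory."""
    memory: list[int]
    pc: int = 0
    mar: int = 0
    mbr: int = 0
    ir: int = 0


def fetch(cpu: CPU) -> None:
    cpu.mar = cpu.pc                 # PC -> MAR (address placed on the address bus)
    cpu.mbr = cpu.memory[cpu.mar]    # memory read: word on the data bus -> MBR
    cpu.pc += 1                      # PC incremented, ready for the next fetch
    cpu.ir = cpu.mbr                 # MBR -> IR


cpu = CPU(memory=[0x1940, 0x5941, 0x2941])   # arbitrary sample instruction words
fetch(cpu)
```

In hardware the PC increment can overlap with the memory read; the sequential statements here only show the order of the data movements.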
Data Flow (Fetch Diagram)
Data Flow (Data Fetch)
Once the fetch cycle is over, the control unit examines the contents
of the IR to determine if it contains an operand specifier using
indirect addressing.
If so, an indirect cycle is performed.
The rightmost N bits of the MBR, which contain the address
reference, are transferred to the MAR.
Then the control unit requests a memory read, to get the desired
address of the operand into the MBR.
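The indirect cycle can be sketched in the same style. In this Python illustration the register dictionary, the function name indirect_cycle, and the choice of N = 12 address bits are assumptions; the slides do not fix a word format.

```python
ADDRESS_BITS = 12   # assumption: the address reference is the rightmost 12 bits


def indirect_cycle(regs: dict[str, int], memory: list[int]) -> None:
    """Replace the address reference held in the MBR with the operand's address."""
    address_mask = (1 << ADDRESS_BITS) - 1
    regs["MAR"] = regs["MBR"] & address_mask   # rightmost N bits of MBR -> MAR
    regs["MBR"] = memory[regs["MAR"]]          # memory read: operand address -> MBR


memory = [0] * 16
memory[5] = 9                          # location 5 holds the operand's address (9)
regs = {"MAR": 0, "MBR": 0x1005}       # fetched word: opcode 1, address reference 5
indirect_cycle(regs, memory)
```

After the call, the MBR holds 9, the address at which the operand itself is found.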
Data Flow (Indirect Diagram)
Data Flow (Execute)
The execute cycle takes many forms; the form depends on which of the
various machine instructions is in the IR
This cycle may involve:
Transferring data among registers
Memory read/write
Input/output read/write
ALU operations
Data Flow (Interrupt)
The current contents of the PC must be saved so that the processor can
resume normal activity after the interrupt.
Thus, the contents of the PC are transferred to the MBR to be written
into memory.
The special memory location reserved for this purpose is loaded into
the MAR from the control unit.
It might, for example, be a stack pointer.
The PC is loaded with the address of the interrupt routine.
As a result, the next instruction cycle will begin by fetching the
appropriate instruction.
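A minimal sketch of the interrupt cycle data flow, assuming the save location is given by a stack pointer that grows toward lower addresses. The register dictionary, the function name interrupt_cycle, and the handler address are hypothetical.

```python
def interrupt_cycle(regs: dict[str, int], memory: list[int], handler_address: int) -> None:
    """Save the PC in memory and redirect execution to the interrupt routine."""
    regs["MBR"] = regs["PC"]             # PC -> MBR, to be written into memory
    regs["SP"] -= 1                      # assumption: stack grows downward
    regs["MAR"] = regs["SP"]             # save location -> MAR
    memory[regs["MAR"]] = regs["MBR"]    # memory write of the old PC
    regs["PC"] = handler_address         # PC <- address of the interrupt routine


memory = [0] * 8
regs = {"PC": 3, "MAR": 0, "MBR": 0, "SP": 8}
interrupt_cycle(regs, memory, handler_address=6)
```

The next fetch cycle then starts at address 6, and the saved value 3 in memory lets the processor resume the interrupted program later.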
Data Flow (Interrupt Diagram)
INSTRUCTION PIPELINING
In a computer system, greater performance can be achieved by:
Taking advantage of improvements in technology, such as faster
circuitry
Making organizational enhancements to the processor
As we have seen before, some of the organizational approaches are:
The use of multiple registers rather than a single accumulator
The use of a cache memory
Another organizational approach, which is quite common, is
instruction pipelining
The process of fetching an instruction during the execution of the
previous instruction is known as instruction pipelining
As a simple approach, consider subdividing instruction processing
into two stages: fetch instruction and execute instruction
There are times during the execution of an instruction when main
memory is not being accessed.
This time could be used to fetch the next instruction in parallel with
the execution of the current one
The pipeline has two independent stages. The first stage fetches an
instruction and buffers it
When the second stage is free, the first stage passes it the buffered
instruction
While the second stage is executing the instruction, the first stage
takes advantage of any unused memory cycles to fetch and buffer
the next instruction.
This is called instruction prefetch or fetch overlap
This approach, which involves instruction buffering, requires more
registers
In general, pipelining requires registers to store data between
stages.
It should be clear that this process will speed up instruction execution:
If the fetch and execute stages were of equal duration, the instruction cycle time
would be halved.
However, if we look more closely at this pipeline in Figure 2(a), we will see that
this doubling of execution rate is unlikely for two reasons:
1. The execution time will generally be longer than the fetch time. Thus, the fetch
stage may have to wait for some time before it can empty its buffer.
2. A conditional branch instruction makes the address of the next instruction to be
fetched unknown.
Thus, the fetch stage must wait until it receives the next instruction address from
the execute stage.
The execute stage may then have to wait while the next instruction is fetched.
Guessing can reduce the time loss from the second reason. A simple
rule is the following:
When a conditional branch instruction is passed on from the fetch to
the execute stage, the fetch stage fetches the next instruction in
memory after the branch instruction.
Then, if the branch is not taken, no time is lost.
If the branch is taken, the fetched instruction must be discarded and
a new instruction fetched.
Figure 2: Two-Stage Instruction Pipeline
While these factors reduce the potential effectiveness of the two-stage
pipeline, some speedup occurs.
To gain further speedup, the pipeline must have more stages.
Fetch instruction (FI): Read the next expected instruction into a buffer.
Decode instruction (DI): Determine the opcode and the operand
specifiers.
Calculate operands (CO): Calculate the effective address of each
source operand
Fetch operands (FO): Fetch each operand from memory. Operands in
registers need not be fetched.
Execute instruction (EI): Perform the indicated operation and store
the result, if any, in the specified destination operand location.
Write operand (WO): Store the result in memory
With this decomposition, the various stages will be of more nearly
equal duration.
For the sake of illustration, consider Figure 3 under the following assumptions:
Each stage has equal duration
There are no memory conflicts
Each instruction goes through all six stages of the pipeline
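Under these idealized assumptions the total execution time follows directly: the first instruction needs k cycles to pass through a k-stage pipeline, and each later instruction completes one cycle after its predecessor. The Python sketch below shows the arithmetic; the function names pipeline_cycles and speedup are hypothetical, and branches and stalls are ignored.

```python
def pipeline_cycles(n_instructions: int, k_stages: int) -> int:
    """Cycles to run n instructions through an ideal k-stage pipeline."""
    # k cycles for the first instruction, then one more per remaining instruction.
    return k_stages + (n_instructions - 1)


def speedup(n_instructions: int, k_stages: int) -> float:
    """Ratio of unpipelined time (n * k cycles) to ideal pipelined time."""
    return (n_instructions * k_stages) / pipeline_cycles(n_instructions, k_stages)


cycles = pipeline_cycles(9, 6)    # nine instructions, six stages
ratio = speedup(9, 6)
```

For nine instructions and six stages this gives 14 cycles instead of 54. As the number of instructions grows, the speedup approaches the number of stages.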
Figure 3: Timing Diagram for Instruction Pipeline Operation
Factors that Limit Performance Enhancement
The six stages may not be of equal duration
A conditional branch instruction can invalidate several
instruction fetches
Each instruction may not go through all six stages
E.g., a load instruction does not need the WO stage
Memory conflicts
The FI, FO, and WO stages each involve a memory access
However, the desired value may be in cache, or the FO or WO stage may be
null, so memory conflicts often do not slow down the pipeline
Interrupts
Figure 4: The Effect of a Conditional Branch on Instruction
Pipeline Operation
Figure 5: Six-Stage CPU Instruction Pipeline
Pipeline Hazards
A pipeline hazard occurs when the pipeline, or some portion of the
pipeline, must stall because conditions do not permit continued execution
Such a pipeline stall is also referred to as a pipeline bubble
There are three types of hazards:
Resource
Data
Control
Resource Hazards
A resource hazard occurs when two (or more) instructions that are
already in the pipeline need the same resource
The result is that the instructions must be executed in serial rather
than parallel for a portion of the pipeline
A resource hazard is sometimes referred to as a structural hazard
Let us consider a simple example of a resource hazard in Figure 6.
Assume a simplified five stage pipeline, in which each stage takes
one clock cycle
Figure 6(a) shows the ideal case, in which a new instruction enters
the pipeline each clock cycle.
Now assume that main memory has a single port and that all
instruction fetches and data reads and writes must be performed one
at a time
An operand read from or write to memory cannot be performed in
parallel with an instruction fetch
Figure 6(b) assumes that the source operand for instruction I1 is in
memory, rather than in a register
Therefore, the fetch instruction stage of the pipeline must idle for
one cycle before beginning the instruction fetch for instruction I3
The figure assumes that all other operands are in registers
Another example of a resource conflict is a situation in which
multiple instructions are ready to enter the execute instruction phase
and there is a single ALU.
One solution to such resource hazards is to increase the available
resources, such as having multiple ports into main memory and
multiple ALUs.
Figure 6: Resource Hazard Diagram
Data Hazards
A data hazard occurs when there is a conflict in the access of an operand
location.
If two instructions are executed in strict sequence, no problem occurs
If the instructions are executed in a pipeline, then it is possible for the operand
value to be updated in such a way as to produce a different result than would
occur with strict sequential execution
In other words, the program produces an incorrect result because of the use of
pipelining
As an example, consider the following x86 machine instruction sequence:
    ADD EAX, EBX
    SUB ECX, EAX
The first instruction adds the contents of the 32-bit registers EAX and EBX
and stores the result in EAX
The second instruction subtracts the contents of EAX from ECX and stores the
result in ECX
The ADD instruction does not update register EAX until the end of stage 5,
which occurs at clock cycle 5
But the SUB instruction needs that value at the beginning of its stage 3, which
occurs at clock cycle 4
To maintain correct operation, the pipeline must stall for two clock cycles
Thus, in the absence of special hardware and specific avoidance algorithms,
such a data hazard results in inefficient pipeline usage.
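The two-cycle stall can be checked with a small calculation. The Python sketch below assumes a five-stage pipeline with stages named FI, DI, FO, EI, and WO, one clock cycle per stage, no forwarding hardware, and a register value that becomes readable only in the cycle after the producer's WO stage. The stage names and the function name raw_stall_cycles are assumptions made for the illustration.

```python
STAGES = ["FI", "DI", "FO", "EI", "WO"]   # assumed five-stage pipeline


def raw_stall_cycles(producer_start: int, consumer_start: int) -> int:
    """Stall cycles the consumer needs so that its operand fetch sees the new value."""
    write_done = producer_start + STAGES.index("WO")    # cycle in which WO completes
    read_wanted = consumer_start + STAGES.index("FO")   # cycle in which FO would run
    earliest_read = write_done + 1                      # value readable from next cycle
    return max(0, earliest_read - read_wanted)


# ADD enters the pipeline at clock cycle 1, SUB at clock cycle 2.
stall = raw_stall_cycles(producer_start=1, consumer_start=2)
```

With ADD starting at cycle 1 and SUB at cycle 2, the write completes in cycle 5 and the read is wanted in cycle 4, so the sketch reports a stall of two cycles.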
Figure 7: Data Hazard Diagram for Given Instruction
Types of Data Hazard
There are three types of data hazards:
Read after write (RAW), or true dependency
· An instruction modifies a register or memory location
· A succeeding instruction reads the data in that memory or register location
· A hazard occurs if the read takes place before the write operation is
complete
Write after read (WAR), or antidependency
· An instruction reads a register or memory location
· A succeeding instruction writes to the location
· A hazard occurs if the write operation completes before the read operation
takes place
Write after write (WAW), or output dependency
· Two instructions both write to the same location
· A hazard occurs if the write operations take place in the reverse
order of the intended sequence
The previous example (ADD followed by SUB) is a RAW hazard
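The three categories can be told apart by comparing the locations each instruction reads and writes. The Python sketch below is a hypothetical helper (classify_dependencies is not a standard function) that reports which dependencies exist between two instructions in program order. Whether a dependency actually becomes a hazard depends on the pipeline timing.

```python
def classify_dependencies(
    first_reads: set[str],
    first_writes: set[str],
    second_reads: set[str],
    second_writes: set[str],
) -> set[str]:
    """Return the dependency types between two instructions in program order."""
    kinds: set[str] = set()
    if first_writes & second_reads:
        kinds.add("RAW")   # second reads what first writes
    if first_reads & second_writes:
        kinds.add("WAR")   # second writes what first reads
    if first_writes & second_writes:
        kinds.add("WAW")   # both write the same location
    return kinds


# ADD EAX, EBX reads EAX and EBX and writes EAX;
# SUB ECX, EAX reads ECX and EAX and writes ECX.
kinds = classify_dependencies({"EAX", "EBX"}, {"EAX"}, {"ECX", "EAX"}, {"ECX"})
```

For the ADD and SUB pair the only shared location is EAX, written by the first instruction and read by the second, so the result is the single entry RAW.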
Control Hazard
A control hazard, also known as a branch hazard, occurs when the
pipeline makes the wrong decision on a branch prediction
It therefore brings instructions into the pipeline that must subsequently
be discarded
Dealing with Branches
One of the major problems in designing an instruction pipeline is assuring a
steady flow of instructions to the initial stages of the pipeline
Until a conditional branch instruction is actually executed, it is impossible to
determine whether the branch will be taken or not
A variety of approaches have been taken for dealing with conditional branches:
Multiple streams
Prefetch branch target
Loop buffer
Branch prediction
Delayed branch
End of Chapter 6