Basic Processing Unit
Overview
Instruction Set Processor (ISP)
Central Processing Unit (CPU)
A typical computing task consists of a series
of steps specified by a sequence of machine
instructions that constitute a program.
An instruction is executed by carrying out a
sequence of more rudimentary operations.
Fundamental Concepts
The processor fetches one instruction at a time and
performs the operation specified.
Instructions are fetched from successive memory
locations until a branch or a jump instruction is
encountered.
The processor keeps track of the address of the memory
location containing the next instruction to be fetched
using the Program Counter (PC).
The Instruction Register (IR) holds the instruction being executed.
Executing an Instruction
Fetch the contents of the memory location pointed
to by the PC. The contents of this location are
loaded into the IR (fetch phase).
IR ← [[PC]]
Assuming that the memory is byte addressable,
increment the contents of the PC by 4 (fetch phase).
PC ← [PC] + 4
Carry out the actions specified by the instruction in
the IR (execution phase).
Processor Organization
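Figure 7.1. Single-bus organization of the datapath inside a processor.
(Diagram: the PC, MAR, MDR, IR, Y, Z, TEMP, and the general-purpose registers R0 to R(n-1) share the internal processor bus. A MUX controlled by Select passes either Y or the constant 4 to the ALU. The ALU control lines choose operations such as Add, Sub, and XOR, with a carry-in. The instruction decoder and control logic generate the control signals. MAR drives the address lines of the memory bus and MDR connects to its data lines. MDR has two inputs and two outputs. Textbook page 413.)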
Executing an Instruction
Transfer a word of data from one processor
register to another or to the ALU.
Perform an arithmetic or a logic operation
and store the result in a processor register.
Fetch the contents of a given memory
location and load them into a processor
register.
Store a word of data from a processor
register into a given memory location.
Register Transfers
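Figure 7.2. Input and output gating for the registers in Figure 7.1.
(Diagram: register Ri is gated onto and off the internal processor bus by Riin and Riout. Y is loaded with Yin. A MUX controlled by Select passes Y or the constant 4 to ALU input A; ALU input B comes from the bus. The ALU result is loaded into Z with Zin and driven onto the bus with Zout.)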
Register Transfers
All operations and data transfers are controlled by the processor clock.
Figure 7.3. Input and output gating for one register bit.
(Diagram: a D flip-flop driven by Clock. Riin selects, through a two-input multiplexer with inputs 0 and 1, what reaches the D input; Riout gates the Q output onto the bus.)
Performing an Arithmetic or
Logic Operation
The ALU is a combinational circuit that has no
internal storage.
ALU gets the two operands from MUX and bus.
The result is temporarily stored in register Z.
What is the sequence of operations to add the
contents of register R1 to those of R2 and store the
result in R3?
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in
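The three control steps above can be traced with a small simulation. This is only a sketch: the dictionary of registers, the function name `add_registers`, and the 32-bit word mask are assumptions made for illustration, not part of the slides.

```python
WORD_MASK = 0xFFFFFFFF  # assumed 32-bit word size


def add_registers(regs, src_a, src_b, dst):
    """Run the three control steps and return the value on the bus at each step."""
    bus_trace = []

    # Step 1: R1out, Yin (source register drives the bus, Y latches it)
    bus = regs[src_a]
    regs["Y"] = bus
    bus_trace.append(bus)

    # Step 2: R2out, SelectY, Add, Zin (ALU adds Y and the bus value, Z latches)
    bus = regs[src_b]
    regs["Z"] = (regs["Y"] + bus) & WORD_MASK
    bus_trace.append(bus)

    # Step 3: Zout, R3in (Z drives the bus, destination register latches it)
    bus = regs["Z"]
    regs[dst] = bus
    bus_trace.append(bus)
    return bus_trace


regs = {"R1": 5, "R2": 7, "R3": 0, "Y": 0, "Z": 0}
trace = add_registers(regs, "R1", "R2", "R3")
print(regs["R3"], trace)
```

Only one value is on the bus in each step, which is why the addition needs the Y and Z registers and three separate steps.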
Fetching a Word from Memory
Address into MAR; issue Read operation; data into MDR.
Figure 7.4. Connection and control signals for register MDR.
(Diagram: MDR sits between the memory bus data lines and the internal processor bus. MDRinE and MDRoutE gate the memory-bus side; MDRin and MDRout gate the processor-bus side.)
Fetching a Word from Memory
The response time of each memory access varies
(for example, with a cache miss or memory-mapped I/O).
To accommodate this, the processor waits until it
receives an indication that the requested operation
has been completed (Memory-Function-Completed,
MFC).
Move (R1), R2
1. MAR ← [R1]
2. Start a Read operation on the memory bus
3. Wait for the MFC response from the memory
4. Load MDR from the memory bus
5. R2 ← [MDR]
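The steps for Move (R1), R2 can be sketched as a loop that waits for MFC. The generator `memory_read`, the `latency` parameter, and the register dictionary are hypothetical names introduced for this illustration; real memory timing is not modeled.

```python
def memory_read(memory, address, latency):
    """Yield (MFC, data): MFC stays False for `latency` cycles, then turns True."""
    for _ in range(latency):
        yield False, None
    yield True, memory[address]


def move_indirect(regs, memory, latency):
    regs["MAR"] = regs["R1"]                      # MAR <- [R1]
    wait_cycles = 0
    # Start a Read operation on the memory bus
    for mfc, data in memory_read(memory, regs["MAR"], latency):
        if not mfc:
            wait_cycles += 1                      # wait for the MFC response
            continue
        regs["MDR"] = data                        # load MDR from the memory bus
    regs["R2"] = regs["MDR"]                      # R2 <- [MDR]
    return wait_cycles


regs = {"R1": 0x100, "R2": 0, "MAR": 0, "MDR": 0}
memory = {0x100: 42}
waited = move_indirect(regs, memory, latency=3)
print(regs["R2"], waited)
```

Changing `latency` changes only the number of wait cycles, not the result, which is the point of the MFC handshake.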
Timing
Assume MAR is always available on the address lines of the memory bus.
Figure 7.5. Timing of a memory Read operation.
(Timing diagram: waveforms for Clock, MARin, Address, Read, MR, MDRinE, Data, MFC, and MDRout across the steps of Move (R1), R2: MAR ← [R1]; start a Read operation on the memory bus; wait for the MFC response from the memory; load MDR from the memory bus; R2 ← [MDR].)
Execution of a Complete
Instruction
Add (R3), R1
Fetch the instruction
Fetch the first operand (the contents of the
memory location pointed to by R3)
Perform the addition
Load the result into R1
Architecture
Figure 7.2 (repeated). Input and output gating for the registers in Figure 7.1.
Execution of a Complete
Instruction
Add (R3), R1
Step  Action
1     PCout, MARin, Read, Select4, Add, Zin
2     Zout, PCin, Yin, WMFC
3     MDRout, IRin
4     R3out, MARin, Read
5     R1out, Yin, WMFC
6     MDRout, SelectY, Add, Zin
7     Zout, R1in, End
Figure 7.6. Control sequence for execution of the instruction Add (R3),R1.
Figure 7.1 (repeated). Single-bus organization of the datapath inside a processor.
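The seven-step control sequence for Add (R3),R1 can be walked through in code. This is a simplified sketch under stated assumptions: memory is a dictionary that answers immediately (so WMFC is just a comment), the word size is 32 bits, and the instruction word is stored as a placeholder string because the slides give no binary encoding.

```python
WORD_MASK = 0xFFFFFFFF  # assumed 32-bit word size


def execute_add_indirect(regs, memory):
    # Step 1: PCout, MARin, Read, Select4, Add, Zin
    regs["MAR"] = regs["PC"]
    regs["Z"] = (regs["PC"] + 4) & WORD_MASK
    # Step 2: Zout, PCin, Yin, WMFC
    regs["PC"] = regs["Z"]
    regs["Y"] = regs["Z"]
    regs["MDR"] = memory[regs["MAR"]]   # data is in MDR once MFC arrives
    # Step 3: MDRout, IRin
    regs["IR"] = regs["MDR"]
    # Step 4: R3out, MARin, Read
    regs["MAR"] = regs["R3"]
    # Step 5: R1out, Yin, WMFC
    regs["Y"] = regs["R1"]
    regs["MDR"] = memory[regs["MAR"]]   # operand is in MDR once MFC arrives
    # Step 6: MDRout, SelectY, Add, Zin
    regs["Z"] = (regs["Y"] + regs["MDR"]) & WORD_MASK
    # Step 7: Zout, R1in, End
    regs["R1"] = regs["Z"]
    return regs


memory = {0x1000: "Add (R3),R1", 0x2000: 30}   # placeholder instruction word
regs = {"PC": 0x1000, "R1": 12, "R3": 0x2000,
        "MAR": 0, "MDR": 0, "IR": None, "Y": 0, "Z": 0}
execute_add_indirect(regs, memory)
print(hex(regs["PC"]), regs["R1"])
```

Steps 1 to 3 are the fetch phase and are the same for every instruction; steps 4 to 7 are specific to this Add.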
Execution of Branch
Instructions
A branch instruction replaces the contents of the
PC with the branch target address, which is
usually obtained by adding an offset X, given
in the branch instruction, to the updated contents of the PC.
The offset X is usually the difference between
the branch target address and the address
immediately following the branch instruction.
Conditional branch
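The relationship between the offset X and the branch target can be checked with a little arithmetic. The sketch below assumes 4-byte instructions, matching the PC increment of 4 used earlier; the function names and addresses are made up for the example.

```python
INSTRUCTION_SIZE = 4  # bytes, assumed to match the PC increment of 4


def branch_offset(branch_address, target_address):
    """Offset X = target address minus the address following the branch."""
    return target_address - (branch_address + INSTRUCTION_SIZE)


def branch_target(branch_address, offset):
    """Target = updated PC (already incremented during fetch) plus the offset."""
    updated_pc = branch_address + INSTRUCTION_SIZE
    return updated_pc + offset


forward = branch_offset(0x2000, 0x2040)    # branch ahead
backward = branch_offset(0x2040, 0x2000)   # branch back, e.g. a loop
print(forward, backward)
```

A backward branch yields a negative offset, so the offset field has to be treated as a signed value.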
Execution of Branch
Instructions
Step  Action
1     PCout, MARin, Read, Select4, Add, Zin
2     Zout, PCin, Yin, WMFC
3     MDRout, IRin
4     Offset-field-of-IRout, Add, Zin
5     Zout, PCin, End
Figure 7.7. Control sequence for an unconditional branch instruction.
Multiple-Bus Organization
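Figure 7.8. Three-bus organization of the datapath.
(Diagram: buses A, B, and C connect a register file, the PC with its incrementer, the ALU, the IR with the instruction decoder, MDR, and MAR. A MUX passes either bus A or the constant 4 to ALU input A; bus B feeds ALU input B; the ALU result R goes to bus C. MDR connects to the memory bus data lines and MAR to the address lines.)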
Multiple-Bus Organization
Add R4, R5, R6
Step  Action
1     PCout, R=B, MARin, Read, IncPC
2     WMFC
3     MDRoutB, R=B, IRin
4     R4outA, R5outB, SelectA, Add, R6in, End
Figure 7.9. Control sequence for the instruction Add R4,R5,R6
for the three-bus organization in Figure 7.8.
Quiz
What is the control sequence for execution of the
instruction Add R1, R2, including the instruction fetch
phase? (Assume single bus architecture.)
Figure 7.1 (repeated). Single-bus organization of the datapath inside a processor.
Hardwired Control
Overview
To execute instructions, the processor must
have some means of generating the control
signals needed in the proper sequence.
Two categories: hardwired control and
microprogrammed control.
A hardwired system can operate at high speed,
but it offers little flexibility.
Control Unit Organization
Figure 7.10. Control unit organization.
(Diagram: the clock (CLK) drives the control step counter. A decoder/encoder combines the step count with the IR, the external inputs, and the condition codes to produce the control signals.)
Detailed Block Description
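Figure 7.11. Separation of the decoding and encoding functions.
(Diagram: the clock (CLK) drives the control step counter, which has a Reset input. The step decoder produces T1, T2, ..., Tn. The instruction decoder turns the IR into INS1, INS2, ..., INSm. The encoder combines these with the external inputs and condition codes to produce the control signals, including Run and End.)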
Generating Zin
Zin = T1 + T6 · ADD + T4 · BR + ...
Figure 7.12. Generation of the Zin control signal for the processor in Figure 7.1.
(Logic diagram: T6 AND Add, T4 AND Branch, and T1 are ORed together to form Zin.)
Generating End
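End = T7 · ADD + T5 · BR + (T5 · N + T4 · N') · BRN + ...
where N' is the complement of the N condition flag (a Branch<0 that is not taken ends at step 4).
Figure 7.13. Generation of the End control signal.
(Logic diagram: T7 AND Add, T5 AND Branch, and the Branch<0 terms gated by T4, T5, and N are ORed together to form End.)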
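The Zin and End equations translate directly into Boolean functions of the current time step and the decoded instruction. This sketch covers only the terms written out above (the trailing "+ ..." terms for other instructions are omitted), and it assumes that the T4 term of Branch<0 uses the complement of N, which is what the Branch<0 microroutine's "If N=0" test implies.

```python
def z_in(step, add=False, br=False):
    """Zin = T1 + T6.ADD + T4.BR (terms for other instructions omitted)."""
    return step == 1 or (step == 6 and add) or (step == 4 and br)


def end(step, add=False, br=False, brn=False, n=False):
    """End = T7.ADD + T5.BR + (T5.N + T4.N').BRN (other terms omitted)."""
    return ((step == 7 and add)
            or (step == 5 and br)
            or (brn and ((step == 5 and n) or (step == 4 and not n))))


# Zin is asserted in step 1 for every instruction (PC increment during fetch)
print([t for t in range(1, 8) if z_in(t, add=True)])   # steps where Add asserts Zin
print([t for t in range(1, 8) if end(t, brn=True, n=False)])
```

In hardware each `and` is an AND gate and each `or` an input to the final OR gate, as in Figures 7.12 and 7.13.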
A Complete Processor
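Figure 7.14. Block diagram of a complete processor.
(Diagram: the processor contains an instruction unit, an integer unit, a floating-point unit, an instruction cache, a data cache, and a bus interface. The bus interface connects over the system bus to main memory and input/output.)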
Microprogrammed
Control
Overview
Control signals are generated by a program similar to machine language
programs.
Control Word (CW); microroutine; microinstruction
Figure 7.15. An example of microinstructions for Figure 7.6.
(Table: one row per microinstruction and one column per control signal: PCin, PCout, MARin, Read, MDRout, IRin, Yin, Select, Add, Zin, Zout, R1out, R1in, R3out, WMFC, End.)
Overview
Step  Action
1     PCout, MARin, Read, Select4, Add, Zin
2     Zout, PCin, Yin, WMFC
3     MDRout, IRin
4     R3out, MARin, Read
5     R1out, Yin, WMFC
6     MDRout, SelectY, Add, Zin
7     Zout, R1in, End
Figure 7.6. Control sequence for execution of the instruction Add (R3),R1.
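One way to see what a control word is: give every control signal its own bit and write one word per step of the control sequence for Add (R3),R1. The sketch below does that. The column order and the convention that Select = 1 means Select4 (and 0 means SelectY) are assumptions for this example.

```python
# One bit per control signal, in an assumed column order
SIGNALS = ["PCin", "PCout", "MARin", "Read", "MDRout", "IRin", "Yin", "Select",
           "Add", "Zin", "Zout", "R1out", "R1in", "R3out", "WMFC", "End"]

# Active signals in each step of the control sequence for Add (R3),R1
STEPS = [
    ["PCout", "MARin", "Read", "Select", "Add", "Zin"],  # Select=1: Select4
    ["Zout", "PCin", "Yin", "WMFC"],
    ["MDRout", "IRin"],
    ["R3out", "MARin", "Read"],
    ["R1out", "Yin", "WMFC"],
    ["MDRout", "Add", "Zin"],                            # Select=0: SelectY
    ["Zout", "R1in", "End"],
]


def control_word(active):
    """Return the control word as a bit string, one character per signal."""
    return "".join("1" if name in active else "0" for name in SIGNALS)


microroutine = [control_word(step) for step in STEPS]
for address, word in enumerate(microroutine):
    print(address, word)
```

Most bits in each word are 0, which is the inefficiency that encoded microinstruction formats try to reduce.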
Overview
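Control store
Figure 7.16. Basic organization of a microprogrammed control unit.
(Diagram: the IR feeds the starting address generator, which loads the µPC. The clock advances the µPC, the µPC addresses the control store, and the control store outputs the CW.)
One function cannot be carried out by this simple organization.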
Overview
The previous organization cannot handle the situation when the control
unit is required to check the status of the condition codes or external
inputs to choose between alternative courses of action.
Use conditional branch microinstruction.
Address  Microinstruction
0        PCout, MARin, Read, Select4, Add, Zin
1        Zout, PCin, Yin, WMFC
2        MDRout, IRin
3        Branch to starting address of appropriate microroutine
...
25       If N=0, then branch to microinstruction 0
26       Offset-field-of-IRout, SelectY, Add, Zin
27       Zout, PCin, End
Figure 7.17. Microroutine for the instruction Branch<0.
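The effect of the conditional branch microinstruction at address 25 can be seen by listing which control-store addresses are visited for each value of N. This is a minimal sketch of the sequencing only; the function name and the idea of returning an address trace are illustrative assumptions, and no datapath actions are modeled.

```python
def branch_if_negative_trace(n_flag, routine_start=25):
    """Return the control-store addresses visited for one Branch<0 instruction."""
    trace = [0, 1, 2, 3]        # common fetch, then branch to the microroutine
    trace.append(routine_start)  # 25: if N=0, branch to microinstruction 0
    if n_flag == 0:
        return trace             # branch not taken: next comes the fetch at 0
    trace.extend([routine_start + 1, routine_start + 2])  # 26 and 27 (End)
    return trace


taken = branch_if_negative_trace(1)
not_taken = branch_if_negative_trace(0)
print(taken, not_taken)
```

When N = 0 the microroutine ends two steps earlier, because the offset is never added to the PC.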
Overview
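Figure 7.18. Organization of the control unit to allow conditional branching in the microprogram.
(Diagram: the IR, external inputs, and condition codes feed the starting and branch address generator, which loads the µPC. The clock advances the µPC, the µPC addresses the control store, and the control store outputs the CW.)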
Microinstructions
A straightforward way to structure
microinstructions is to assign one bit position
to each control signal.
However, this is very inefficient: the microinstructions
are long, and only a few bits are 1 in any one of them.
The length can be reduced: most signals are
not needed simultaneously, and many signals
are mutually exclusive.
Mutually exclusive signals are placed in
the same group, and each group is binary coded.
Partial Format for the Microinstructions
Microinstruction: F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8

F1 (4 bits): 0000 No transfer; 0001 PCout; 0010 MDRout; 0011 Zout; 0100 R0out; 0101 R1out; 0110 R2out; 0111 R3out; 1010 TEMPout; 1011 Offsetout
F2 (3 bits): 000 No transfer; 001 PCin; 010 IRin; 011 Zin; 100 R0in; 101 R1in; 110 R2in; 111 R3in
F3 (3 bits): 000 No transfer; 001 MARin; 010 MDRin; 011 TEMPin; 100 Yin
F4 (4 bits): 0000 Add; 0001 Sub; ...; 1111 XOR (16 ALU functions)
F5 (2 bits): 00 No action; 01 Read; 10 Write
F6 (1 bit): 0 SelectY; 1 Select4
F7 (1 bit): 0 No action; 1 WMFC
F8 (1 bit): 0 Continue; 1 End
Figure 7.19. An example of a partial format for field-encoded microinstructions.
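As an illustration of field encoding, the sketch below packs a set of active control signals into the eight fields of Figure 7.19 and rejects two signals that fall in the same field. The `FIELDS` table copies only the codes listed in the figure; the function name `encode` and its error handling are assumptions for this example.

```python
# (field name, width in bits, signal -> code), copied from the field list above
FIELDS = [
    ("F1", 4, {"PCout": 0b0001, "MDRout": 0b0010, "Zout": 0b0011,
               "R0out": 0b0100, "R1out": 0b0101, "R2out": 0b0110,
               "R3out": 0b0111, "TEMPout": 0b1010, "Offsetout": 0b1011}),
    ("F2", 3, {"PCin": 0b001, "IRin": 0b010, "Zin": 0b011, "R0in": 0b100,
               "R1in": 0b101, "R2in": 0b110, "R3in": 0b111}),
    ("F3", 3, {"MARin": 0b001, "MDRin": 0b010, "TEMPin": 0b011, "Yin": 0b100}),
    ("F4", 4, {"Add": 0b0000, "Sub": 0b0001, "XOR": 0b1111}),
    ("F5", 2, {"Read": 0b01, "Write": 0b10}),
    ("F6", 1, {"SelectY": 0, "Select4": 1}),
    ("F7", 1, {"WMFC": 1}),
    ("F8", 1, {"End": 1}),
]


def encode(signals):
    """Pack active signals into one field-encoded microinstruction bit string."""
    remaining = set(signals)
    bits = ""
    for name, width, codes in FIELDS:
        chosen = [s for s in codes if s in remaining]
        if len(chosen) > 1:
            raise ValueError(f"{name}: {chosen} cannot be active together")
        code = codes[chosen[0]] if chosen else 0   # 0 is the field's default code
        remaining -= set(chosen)
        bits += format(code, f"0{width}b")
    if remaining:
        raise ValueError(f"unknown signals: {sorted(remaining)}")
    return bits


word = encode(["PCout", "MARin", "Read", "Select4", "Add", "Zin"])
print(word, len(word))
```

Two signals from the same field, such as PCout and Zout, cannot be encoded in one microinstruction, so grouping is only safe for signals that are mutually exclusive.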
What is the price paid for this scheme?
Further Improvement
Enumerate the patterns of required signals in
all possible microinstructions. Each
meaningful combination of active control
signals can then be assigned a distinct code.
Vertical organization
Horizontal organization
Microprogram Sequencing
If all microprograms required only straightforward
sequential execution of microinstructions except for
branches, letting a µPC govern the sequencing
would be efficient.
However, two disadvantages:
Having a separate microroutine for each machine instruction results
in a large total number of microinstructions and a large control store.
Longer execution time because it takes more time to carry out the
required branches.
Example: Add src, Rdst
Four addressing modes: register, autoincrement,
autodecrement, and indexed (with indirect forms).
- Bit-ORing
- Wide-Branch Addressing
- WMFC
Contents of IR: OP code (bits 11 and above) | Mode (bits 10 to 8) | Rsrc (bits 7 to 4) | Rdst (bits 3 to 0)

Address (octal)  Microinstruction
000              PCout, MARin, Read, Select4, Add, Zin
001              Zout, PCin, Yin, WMFC
002              MDRout, IRin
003              µBranch {µPC ← 101 (from Instruction decoder);
                 µPC5,4 ← [IR10,9]; µPC3 ← [IR10]' · [IR9]' · [IR8]}
121              Rsrcout, MARin, Read, Select4, Add, Zin
122              Zout, Rsrcin
123              µBranch {µPC ← 170; µPC0 ← [IR8]'}, WMFC
170              MDRout, MARin, Read, WMFC
171              MDRout, Yin
172              Rdstout, SelectY, Add, Zin
173              Zout, Rdstin, End
(A prime, as in [IR8]', denotes the complement of the bit.)
Figure 7.21. Microinstruction for Add (Rsrc)+,Rdst.
Note: Microinstruction at location 170 is not executed for this addressing mode.
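The branch addresses in this microroutine are formed by bit-ORing: the decoder supplies a base address and selected IR bits are ORed into it. The sketch below reproduces that for the microinstructions at 003 and 123. The complemented terms (NOT IR10, NOT IR9, NOT IR8) and the mode bit pattern used for the autoincrement case (IR10,9 = 01, IR8 = 0) are assumptions, chosen so that the routine starts at 121 and skips 170 as the note under the figure requires.

```python
def bit(value, position):
    """Return bit `position` of `value` as 0 or 1."""
    return (value >> position) & 1


def next_address_after_003(ir):
    """Base 101 (octal) from the instruction decoder, with mode bits ORed in."""
    upc = 0o101
    upc |= bit(ir, 10) << 5                      # uPC5 <- IR10
    upc |= bit(ir, 9) << 4                       # uPC4 <- IR9
    not_10 = 1 - bit(ir, 10)                     # assumed complemented term
    not_9 = 1 - bit(ir, 9)                       # assumed complemented term
    upc |= (not_10 & not_9 & bit(ir, 8)) << 3    # uPC3
    return upc


def next_address_after_123(ir):
    """Base 170 (octal), with the complement of IR8 ORed into bit 0."""
    return 0o170 | (1 - bit(ir, 8))              # assumed complemented term


autoincrement_ir = 0b010 << 8    # assumed mode bits: IR10,9 = 01, IR8 = 0
start = next_address_after_003(autoincrement_ir)
after_123 = next_address_after_123(autoincrement_ir)
print(oct(start), oct(after_123))
```

With IR8 = 1 the second branch lands on 170 instead, so the same microroutine serves both cases with one extra memory read.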
Microinstructions with Next-Address Field
The microprogram we discussed requires several
branch microinstructions, which perform no useful
operation in the datapath.
A powerful alternative approach is to include an
address field as a part of every microinstruction to
indicate the location of the next microinstruction to
be fetched.
Pros: separate branch microinstructions are virtually
eliminated; few limitations in assigning addresses to
microinstructions.
Cons: additional bits for the address field (around
1/6 of the control store).
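Microinstructions with Next-Address Field
Figure 7.22. Microinstruction sequencing organization.
(Diagram: the IR, external inputs, and condition codes feed the decoding circuits, which load the µAR. The µAR addresses the control store, and the fetched microinstruction is held in the µIR. Its next-address field goes back to the decoding circuits; its other fields go to the microinstruction decoder, which produces the control signals.)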
Microinstruction: F0 | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10

F0 (8 bits): Address of next microinstruction
F1 (3 bits): 000 No transfer; 001 PCout; 010 MDRout; 011 Zout; 100 Rsrcout; 101 Rdstout; 110 TEMPout
F2 (3 bits): 000 No transfer; 001 PCin; 010 IRin; 011 Zin; 100 Rsrcin; 101 Rdstin
F3 (3 bits): 000 No transfer; 001 MARin; 010 MDRin; 011 TEMPin; 100 Yin
F4 (4 bits): 0000 Add; 0001 Sub; ...; 1111 XOR
F5 (2 bits): 00 No action; 01 Read; 10 Write
F6 (1 bit): 0 SelectY; 1 Select4
F7 (1 bit): 0 No action; 1 WMFC
F8 (1 bit): 0 NextAdrs; 1 InstDec
F9 (1 bit): 0 No action; 1 ORmode
F10 (1 bit): 0 No action; 1 ORindsrc
Figure 7.23. Format for microinstructions in the example of Section 7.5.3.
Implementation of the Microroutine
Figure 7.24. Implementation of the microroutine of Figure 7.21 using a
next-microinstruction address field. (See Figure 7.23 for encoded signals.)
(Table: one row per microinstruction, with columns Octal address, F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10. Legible rows, fields F0 to F5 only:
121: F0 = 01010010, F1 = 100, F2 = 011, F3 = 001, F4 = 0000, F5 = 01
122: F0 = 01111000, F1 = 011, F2 = 100, F3 = 000, F4 = 0000, F5 = 00)
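To check how an encoded row maps back to control signals, the sketch below decodes fields F0 to F5 of the two rows given for addresses 121 and 122, using the codes from Figure 7.23. Fields F6 to F10 are left out, and the function name `decode` and the returned dictionary layout are assumptions for illustration.

```python
# Field codes copied from Figure 7.23 (F1 to F5 only)
F1 = {"000": None, "001": "PCout", "010": "MDRout", "011": "Zout",
      "100": "Rsrcout", "101": "Rdstout", "110": "TEMPout"}
F2 = {"000": None, "001": "PCin", "010": "IRin", "011": "Zin",
      "100": "Rsrcin", "101": "Rdstin"}
F3 = {"000": None, "001": "MARin", "010": "MDRin", "011": "TEMPin",
      "100": "Yin"}
F4 = {"0000": "Add", "0001": "Sub", "1111": "XOR"}
F5 = {"00": None, "01": "Read", "10": "Write"}


def decode(f0, f1, f2, f3, f4, f5):
    """Decode fields F0..F5 into next address (octal string), signals, ALU op."""
    signals = [s for s in (F1[f1], F2[f2], F3[f3], F5[f5]) if s is not None]
    return {
        "next": format(int(f0, 2), "03o"),   # F0 holds the next address
        "signals": signals,
        "alu": F4[f4],
    }


row_121 = decode("01010010", "100", "011", "001", "0000", "01")
row_122 = decode("01111000", "011", "100", "000", "0000", "00")
print(row_121)
print(row_122)
```

The decoded next addresses, 122 and 170, agree with the order of the microroutine: no separate branch microinstruction is needed to move from one step to the next.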
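Figure 7.25. Some details of the control signal generating circuitry.
(Diagram: the Rsrc and Rdst fields of the IR feed two decoders that produce R0in to R15in and R0out to R15out, enabled by Rsrcin, Rsrcout, Rdstin, and Rdstout from the microinstruction decoder (fields F1 and F2). Fields F8, F9, and F10 produce InstDecout, ORmode, and ORindsrc for the decoding circuits, which combine them with the IR, external inputs, condition codes, and the next-address field to load the µAR by bit-ORing. The µAR addresses the control store; the microinstruction decoder also produces the other control signals.)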