Chapter 7. Basic Processing Unit
Overview
Instruction Set Processor (ISP)
Central Processing Unit (CPU)
A typical computing task consists of a series
of steps specified by a sequence of machine
instructions that constitute a program.
An instruction is executed by carrying out a
sequence of more rudimentary operations.
Some Fundamental Concepts
Fundamental Concepts
The processor fetches one instruction at a time and
performs the operation specified.
Instructions are fetched from successive memory
locations until a branch or a jump instruction is
encountered.
Processor keeps track of the address of the memory
location containing the next instruction to be fetched
using Program Counter (PC).
Instruction Register (IR)
Executing an Instruction
Fetch the contents of the memory location pointed
to by the PC. The contents of this location are
loaded into the IR (fetch phase).
IR ← [[PC]]
Assuming that the memory is byte addressable and each
instruction occupies 4 bytes, increment the contents of
the PC by 4 (fetch phase).
PC ← [PC] + 4
Carry out the actions specified by the instruction in
the IR (execution phase).
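The three steps above can be sketched as a small loop. This is only an illustration: the dictionary-based memory and the three made-up operations (nop, jump, halt) are assumptions, not part of the slides.

```python
# Minimal sketch of the fetch/execute cycle.
# The memory layout and the tiny instruction set are hypothetical.

def run(memory, start_pc=0):
    """Fetch and execute until 'halt'; return the executed ops and the final PC."""
    pc = start_pc
    trace = []
    while True:
        ir = memory[pc]      # IR <- [[PC]]     (fetch phase)
        pc = pc + 4          # PC <- [PC] + 4   (fetch phase)
        op, arg = ir         # execution phase: act on the instruction in IR
        trace.append(op)
        if op == "halt":
            return trace, pc
        if op == "jump":
            pc = arg         # a jump replaces the contents of the PC
        # "nop" changes nothing

# Instructions sit at successive 4-byte addresses in a byte-addressable memory.
program = {
    0: ("nop", None),
    4: ("jump", 12),
    8: ("nop", None),        # skipped because of the jump
    12: ("halt", None),
}

trace, final_pc = run(program)
print(trace, final_pc)
```

Note that the PC is already incremented before the execution phase starts, which is why a jump simply overwrites it.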
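Processor Organization
[Diagram: the internal processor bus connects PC, MAR, MDR, IR, Y, Z, TEMP, and registers R0 to R(n-1), together with the instruction decoder and control logic that issues the control signals. A MUX (Select) chooses between Y and Constant 4 for ALU input A; ALU input B comes from the bus. ALU control lines include Add, Sub, XOR, and Carry-in. MAR drives the address lines and MDR connects to the data lines of the memory bus. Note: MDR has two inputs and two outputs.]
Figure 7.1. Single-bus organization of the datapath inside a processor. (Textbook page 413.)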
Executing an Instruction
Transfer a word of data from one processor
register to another or to the ALU.
Perform an arithmetic or a logic operation
and store the result in a processor register.
Fetch the contents of a given memory
location and load them into a processor
register.
Store a word of data from a processor
register into a given memory location.
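Register Transfers
[Diagram: register Ri loads from the internal processor bus under Riin and drives the bus under Riout. Register Y loads under Yin. A MUX (Select) chooses between Y and Constant 4 for ALU input A; input B comes from the bus. The ALU result is loaded into Z under Zin and placed on the bus under Zout.]
Figure 7.2. Input and output gating for the registers in Figure 7.1.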
Register Transfers
All operations and data transfers are controlled by the processor clock.
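[Diagram: one bit of register Ri, built from a clocked D flip-flop (inputs D and Clock, outputs Q and Q-bar). Riin controls loading from the bus; Riout gates Q onto the bus.]
Figure 7.3. Input and output gating for one register bit.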
Performing an Arithmetic or Logic Operation
The ALU is a combinational circuit that has no
internal storage.
The ALU gets its two operands from the MUX (input A)
and the bus (input B).
The result is temporarily stored in register Z.
What is the sequence of operations to add the
contents of register R1 to those of R2 and store the
result in R3?
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in
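The three control steps can be traced with a small model. This is a sketch only: representing the registers as a Python dictionary and the bus as a local variable is an assumption made for illustration.

```python
# Sketch of R3 <- [R1] + [R2] on a single-bus datapath.
# Register names follow Figure 7.1; the dictionary model is hypothetical.

def add_r1_r2_into_r3(regs):
    regs = dict(regs)
    # Step 1: R1out, Yin  (R1 drives the bus, Y latches it)
    bus = regs["R1"]
    regs["Y"] = bus
    # Step 2: R2out, SelectY, Add, Zin  (ALU adds Y and the bus, Z latches the sum)
    bus = regs["R2"]
    regs["Z"] = regs["Y"] + bus
    # Step 3: Zout, R3in  (Z drives the bus, R3 latches it)
    bus = regs["Z"]
    regs["R3"] = bus
    return regs

result = add_r1_r2_into_r3({"R1": 5, "R2": 7, "R3": 0, "Y": 0, "Z": 0})
print(result["R3"])
```

Three steps are needed because only one value can be on the single bus at a time, so one operand has to be parked in Y and the result in Z.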
Fetching a Word from Memory
Address into MAR; issue Read operation; data into MDR.
[Diagram: register MDR sits between the memory-bus data lines and the internal processor bus. MDRinE and MDRoutE gate the memory-bus side; MDRin and MDRout gate the processor-bus side.]
Figure 7.4. Connection and control signals for register MDR.
Fetching a Word from Memory
The response time of each memory access varies
(cache miss, memory-mapped I/O,…).
To accommodate this, the processor waits until it
receives an indication that the requested operation
has been completed (Memory-Function-Completed,
MFC).
Move (R1), R2
MAR ← [R1]
Start a Read operation on the memory bus
Wait for the MFC response from the memory
Load MDR from the memory bus
R2 ← [MDR]
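The wait on MFC can be illustrated with a toy memory that answers after a configurable number of clock cycles. The SlowMemory class, its latency parameter, and the polling loop are assumptions made for this sketch; they are not taken from the slides.

```python
# Sketch of Move (R1), R2 with a variable-latency memory.
# SlowMemory and its latency model are hypothetical.

class SlowMemory:
    def __init__(self, contents, latency):
        self.contents = contents
        self.latency = latency      # clock cycles until MFC is asserted
        self._address = None
        self._remaining = None

    def start_read(self, address):
        self._address = address
        self._remaining = self.latency

    def tick(self):
        """Advance one clock cycle; return (mfc, data)."""
        self._remaining -= 1
        if self._remaining <= 0:
            return True, self.contents[self._address]
        return False, None

def move_indirect(regs, memory):
    mar = regs["R1"]                # MAR <- [R1]
    memory.start_read(mar)          # start a Read operation on the memory bus
    cycles, mfc, data = 0, False, None
    while not mfc:                  # wait for the MFC response from the memory
        mfc, data = memory.tick()
        cycles += 1
    mdr = data                      # load MDR from the memory bus
    regs["R2"] = mdr                # R2 <- [MDR]
    return cycles

regs = {"R1": 100, "R2": 0}
waited = move_indirect(regs, SlowMemory({100: 42}, latency=3))
print(regs["R2"], waited)
```

The processor code does not depend on the latency value, which is the point of the MFC handshake.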
[Timing diagram over steps 1 to 3, showing Clock, MARin, Address, Read, MR, MDRinE, Data, MFC, and MDRout. Step 1: MAR ← [R1] and a Read operation is started on the memory bus. Step 2: wait for the MFC response from the memory, then load MDR from the memory bus. Step 3: R2 ← [MDR]. Assume that MAR is always available on the address lines of the memory bus.]
Figure 7.5. Timing of a memory Read operation.
Execution of a Complete Instruction
Add (R3), R1
Fetch the instruction
Fetch the first operand (the contents of the
memory location pointed to by R3)
Perform the addition
Load the result into R1
Execution of a Complete Instruction
Add (R3), R1
Step Action
1 PCout, MARin, Read, Select4, Add, Zin
2 Zout, PCin, Yin, WMFC
3 MDRout, IRin
4 R3out, MARin, Read
5 R1out, Yin, WMFC
6 MDRout, SelectY, Add, Zin
7 Zout, R1in, End
Figure 7.6. Control sequence for execution of the instruction Add (R3),R1.
(The datapath is the single-bus organization of Figure 7.1.)
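The seven-step control sequence for Add (R3),R1 can be replayed on a toy model of the single-bus datapath. This is a simplified sketch: the interpreter below, its treatment of Read and WMFC (data becomes available in MDR at the step that carries WMFC), and the sample memory contents are all assumptions for illustration.

```python
# Sketch: replaying the control sequence for Add (R3),R1 on a toy datapath.
# The interpreter and the sample memory/register values are hypothetical.

SEQUENCE = [
    ["PCout", "MARin", "Read", "Select4", "Add", "Zin"],
    ["Zout", "PCin", "Yin", "WMFC"],
    ["MDRout", "IRin"],
    ["R3out", "MARin", "Read"],
    ["R1out", "Yin", "WMFC"],
    ["MDRout", "SelectY", "Add", "Zin"],
    ["Zout", "R1in", "End"],
]

def execute(sequence, regs, memory):
    regs = dict(regs)
    pending = None                          # data requested by the last Read
    for signals in sequence:
        # At most one register drives the bus in a step.
        drivers = [s[:-3] for s in signals if s.endswith("out")]
        bus = regs[drivers[0]] if drivers else None
        alu = None
        if "Add" in signals:
            a = 4 if "Select4" in signals else regs["Y"]
            alu = a + bus
        for s in signals:
            if s.endswith("in"):
                name = s[:-2]
                regs[name] = alu if name == "Z" else bus   # Z loads from the ALU
        if "Read" in signals:
            pending = memory[regs["MAR"]]   # MAR was loaded earlier in this step
        if "WMFC" in signals:
            regs["MDR"] = pending           # memory has responded
    return regs

memory = {1000: "Add (R3),R1", 200: 30}
start = {"PC": 1000, "R1": 12, "R3": 200, "Y": 0, "Z": 0,
         "MAR": 0, "MDR": 0, "IR": None}
final = execute(SEQUENCE, start, memory)
print(final["R1"], final["PC"], final["IR"])
```

Steps 1 to 3 are the fetch phase and are the same for every instruction; only steps 4 to 7 are specific to Add (R3),R1.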
Execution of Branch Instructions
A branch instruction replaces the contents of
PC with the branch target address, which is
usually obtained by adding an offset X, given
in the branch instruction, to the updated
contents of the PC.
The offset X is usually the difference between
the branch target address and the address
immediately following the branch instruction.
Conditional branch
Execution of Branch Instructions
Step Action
1 PCout, MARin, Read, Select4, Add, Zin
2 Zout, PCin, Yin, WMFC
3 MDRout, IRin
4 Offset-field-of-IRout, Add, Zin
5 Zout, PCin, End
Figure 7.7. Control sequence for an unconditional branch instruction.
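The relation between the offset X and the target address can be checked with a short calculation. This sketch assumes 4-byte instructions, as in the PC ← [PC] + 4 step; the function names and the sample addresses are made up.

```python
# Sketch of branch offset arithmetic, assuming 4-byte instructions.
# Function names and addresses are hypothetical.

INSTRUCTION_BYTES = 4

def branch_offset(branch_address, target_address):
    """Offset X: target minus the address immediately following the branch."""
    return target_address - (branch_address + INSTRUCTION_BYTES)

def branch_target(updated_pc, offset):
    """Steps 4 and 5 of the control sequence: PC <- [updated PC] + X."""
    return updated_pc + offset

forward = branch_offset(1000, 1200)
backward = branch_offset(1000, 900)
print(forward, backward, branch_target(1004, forward))
```

A backward branch gives a negative offset, so the offset field has to be treated as a signed value.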
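Multiple-Bus Organization
[Diagram: three buses, A, B, and C. The register file drives buses A and B and is loaded from bus C. The PC has an Incrementer. A MUX chooses between bus A and Constant 4 for ALU input A; the ALU output R drives bus C. IR feeds the instruction decoder. MDR connects to the memory-bus data lines and MAR to the address lines.]
Figure 7.8. Three-bus organization of the datapath.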
Multiple-Bus Organization
Add R4, R5, R6
Step Action
1 PCout, R=B, MARin, Read, IncPC
2 WMFC
3 MDRoutB, R=B, IRin
4 R4outA, R5outB, SelectA, Add, R6in, End
Figure 7.9. Control sequence for the instruction Add R4,R5,R6
for the three-bus organization in Figure 7.8.
Quiz
What is the control sequence for execution of the
instruction Add R1, R2, including the instruction
fetch phase? (Assume the single-bus architecture
of Figure 7.1.)
Hardwired Control
Overview
To execute instructions, the processor must
have some means of generating the control
signals needed in the proper sequence.
Two categories: hardwired control and
microprogrammed control
A hardwired system can operate at high speed,
but offers little flexibility.
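Control Unit Organization
[Diagram: the clock (CLK) drives the control step counter. A decoder/encoder block takes the control step counter, IR, external inputs, and condition codes as inputs and produces the control signals.]
Figure 7.10. Control unit organization.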
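Detailed Block Description
[Diagram: the clock (CLK) drives the control step counter, which has a Reset input. A step decoder produces T1, T2, ..., Tn. An instruction decoder driven by IR produces INS1, INS2, ..., INSm. The encoder combines the step signals, the instruction signals, external inputs, and condition codes to produce the control signals, including Run and End.]
Figure 7.11. Separation of the decoding and encoding functions.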
Generating Zin
Zin = T1 + T6 • ADD + T4 • BR + …
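[Diagram: T4 is ANDed with Branch and T6 is ANDed with Add; the results are ORed with T1 and the remaining terms to produce Zin.]
Figure 7.12. Generation of the Zin control signal for the processor in Figure 7.1.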
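The Zin expression is an ordinary Boolean function of the current time step and the decoded instruction. The sketch below covers only the three terms shown; the terms hidden behind the "…" are left out, and the function signature is an assumption.

```python
# Sketch of Zin = T1 + T6.ADD + T4.BR, with the elided "..." terms omitted.
# The signature (step number plus decoded-instruction flags) is hypothetical.

def z_in(step, add=False, br=False):
    t1, t4, t6 = step == 1, step == 4, step == 6
    return t1 or (t6 and add) or (t4 and br)

print(z_in(1), z_in(6, add=True), z_in(4, br=True), z_in(6, br=True))
```

T1 appears without an instruction term because step 1 belongs to the fetch phase, which is common to all instructions.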
Generating End
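End = T7 • ADD + T5 • BR + (T5 • N + T4 • ¬N) • BRN + …
(BRN denotes Branch<0; ¬N is the complement of the N condition code.)
[Diagram: AND gates for T7 with Add, T5 with Branch, and T5 with N and T4 with ¬N for Branch<0; their outputs are ORed to produce End.]
Figure 7.13. Generation of the End control signal.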
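The End signal can be written the same way. This sketch assumes that the T4 term of Branch<0 uses the complement of N, so that a branch that is not taken ends at step 4 while a taken branch ends at step 5; the function signature is hypothetical and the "…" terms are omitted.

```python
# Sketch of End = T7.ADD + T5.BR + (T5.N + T4.(not N)).BRN.
# Assumes the T4 term uses the complement of N; "..." terms are omitted.

def end(step, n, add=False, br=False, brn=False):
    t4, t5, t7 = step == 4, step == 5, step == 7
    return bool((t7 and add) or (t5 and br) or (((t5 and n) or (t4 and not n)) and brn))

# Branch<0 with N = 0 finishes early; with N = 1 it runs through step 5.
print(end(4, n=False, brn=True), end(4, n=True, brn=True), end(5, n=True, brn=True))
```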
A Complete Processor
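[Diagram: the processor contains an instruction unit, an integer unit, a floating-point unit, an instruction cache, a data cache, and a bus interface. The bus interface connects to the system bus, which also connects main memory and input/output.]
Figure 7.14. Block diagram of a complete processor.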
Microprogrammed Control
Overview
Control signals are generated by a program similar to machine
language programs.
Control Word (CW); microroutine; microinstruction
Micro-
instruction  PCin PCout MARin Read MDRout IRin Yin Select Add Zin Zout R1out R1in R3out WMFC End
1            0    1     1     1    0      0    0   1      1   1   0    0     0    0     0    0
2            1    0     0     0    0      0    1   0      0   0   1    0     0    0     1    0
3            0    0     0     0    1      1    0   0      0   0   0    0     0    0     0    0
4            0    0     1     1    0      0    0   0      0   0   0    0     0    1     0    0
5            0    0     0     0    0      0    1   0      0   0   0    1     0    0     1    0
6            0    0     0     0    1      0    0   0      1   1   0    0     0    0     0    0
7            0    0     0     0    0      0    0   0      0   0   1    0     1    0     0    1
Figure 7.15. An example of microinstructions for Figure 7.6.
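Each row of the figure is a control word with one bit per signal. The sketch below builds such a word from the list of active signals. The column order used here (PCin, PCout, MARin, Read, MDRout, IRin, Yin, Select, Add, Zin, Zout, R1out, R1in, R3out, WMFC, End) is an assumption: it is the order under which the seven bit rows reproduce the control sequence of Figure 7.6. The function name is made up.

```python
# Sketch: one-bit-per-signal control words.
# COLUMNS is an assumed column order; control_word is a hypothetical helper.

COLUMNS = ["PCin", "PCout", "MARin", "Read", "MDRout", "IRin", "Yin", "Select",
           "Add", "Zin", "Zout", "R1out", "R1in", "R3out", "WMFC", "End"]

def control_word(signals):
    active = set(signals)
    if "Select4" in active:             # Select = 1 chooses Constant 4
        active.remove("Select4")
        active.add("Select")
    active.discard("SelectY")           # Select = 0 chooses Y
    unknown = active - set(COLUMNS)
    if unknown:
        raise ValueError(f"no column for: {sorted(unknown)}")
    return "".join("1" if name in active else "0" for name in COLUMNS)

step1 = control_word(["PCout", "MARin", "Read", "Select4", "Add", "Zin"])
step6 = control_word(["MDRout", "SelectY", "Add", "Zin"])
step7 = control_word(["Zout", "R1in", "End"])
print(step1, step6, step7)
```

Most bits in each word are 0, which is the inefficiency that motivates encoded microinstruction formats.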
Overview
Step Action
1 PCout, MARin, Read, Select4, Add, Zin
2 Zout, PCin, Yin, WMFC
3 MDRout, IRin
4 R3out, MARin, Read
5 R1out, Yin, WMFC
6 MDRout, SelectY, Add, Zin
7 Zout, R1in, End
Figure 7.6. Control sequence for execution of the instruction Add (R3),R1.
Overview
[Diagram: IR feeds the starting address generator, which loads the μPC. The μPC, driven by the clock, addresses the control store, which outputs the control word (CW).]
One function cannot be carried out by this simple organization.
Figure 7.16. Basic organization of a microprogrammed control unit.
Overview
The previous organization cannot handle the situation when the control
unit is required to check the status of the condition codes or external
inputs to choose between alternative courses of action.
Use conditional branch microinstruction.
Address Microinstruction
0 PCout, MARin, Read, Select4, Add, Zin
1 Zout, PCin, Yin, WMFC
2 MDRout, IRin
3 Branch to starting address of appropriate microroutine
...
25 If N=0, then branch to microinstruction 0
26 Offset-field-of-IRout, SelectY, Add, Zin
27 Zout, PCin, End
Figure 7.17. Microroutine for the instruction Branch<0.
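The path through the control store for Branch<0 depends on the N condition code. The sketch below traces the microinstruction addresses visited for one instruction; the function name and the hard-coded address checks are assumptions made for illustration.

```python
# Sketch: control-store addresses visited while executing Branch<0.
# The function and its hard-coded addresses mirror Figure 7.17 only loosely.

def branch_lt0_trace(n_flag):
    upc = 0
    visited = []
    while True:
        visited.append(upc)
        if upc == 3:
            upc = 25                 # branch to the start of the Branch<0 microroutine
        elif upc == 25:
            if n_flag == 0:
                return visited       # branch back to microinstruction 0 (next fetch)
            upc = 26
        elif upc == 27:
            return visited           # End
        else:
            upc += 1                 # default: next sequential microinstruction

print(branch_lt0_trace(1), branch_lt0_trace(0))
```

When N = 0 the branch is not taken, so the two microinstructions that compute the target address are skipped.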
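Overview
[Diagram: IR, external inputs, and condition codes feed the starting and branch address generator, which loads the μPC. The μPC, driven by the clock, addresses the control store, which outputs the control word (CW).]
Figure 7.18. Organization of the control unit to allow
conditional branching in the microprogram.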
Microinstructions
A straightforward way to structure
microinstructions is to assign one bit position
to each control signal.
However, this is very inefficient.
The length can be reduced: most signals are
not needed simultaneously, and many signals
are mutually exclusive.
Mutually exclusive signals are placed in the same
group, and each group is binary coded.
Partial Format for the Microinstructions
Microinstruction fields: F1 F2 F3 F4 F5 F6 F7 F8
F1 (4 bits): 0000 No transfer; 0001 PCout; 0010 MDRout; 0011 Zout; 0100 R0out; 0101 R1out; 0110 R2out; 0111 R3out; 1010 TEMPout; 1011 Offsetout
F2 (3 bits): 000 No transfer; 001 PCin; 010 IRin; 011 Zin; 100 R0in; 101 R1in; 110 R2in; 111 R3in
F3 (3 bits): 000 No transfer; 001 MARin; 010 MDRin; 011 TEMPin; 100 Yin
F4 (4 bits): 0000 Add; 0001 Sub; ...; 1111 XOR (16 ALU functions)
F5 (2 bits): 00 No action; 01 Read; 10 Write
F6 (1 bit): 0 SelectY; 1 Select4
F7 (1 bit): 0 No action; 1 WMFC
F8 (1 bit): 0 Continue; 1 End
Figure 7.19. An example of a partial format for field-encoded microinstructions.
What is the price paid for this scheme?
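A field-encoded microinstruction can be assembled by looking up one code per field. The sketch below uses only the codes listed in the figure; the encode helper, the table layout, and the choice to default every unspecified field to code 0 are assumptions.

```python
# Sketch: packing a field-encoded microinstruction (F1..F8, 19 bits in total).
# FORMAT copies the codes listed in the figure; encode() is a hypothetical helper.

FORMAT = {
    "F1": (4, {"No transfer": 0, "PCout": 1, "MDRout": 2, "Zout": 3, "R0out": 4,
               "R1out": 5, "R2out": 6, "R3out": 7, "TEMPout": 10, "Offsetout": 11}),
    "F2": (3, {"No transfer": 0, "PCin": 1, "IRin": 2, "Zin": 3, "R0in": 4,
               "R1in": 5, "R2in": 6, "R3in": 7}),
    "F3": (3, {"No transfer": 0, "MARin": 1, "MDRin": 2, "TEMPin": 3, "Yin": 4}),
    "F4": (4, {"Add": 0, "Sub": 1, "XOR": 15}),
    "F5": (2, {"No action": 0, "Read": 1, "Write": 2}),
    "F6": (1, {"SelectY": 0, "Select4": 1}),
    "F7": (1, {"No action": 0, "WMFC": 1}),
    "F8": (1, {"Continue": 0, "End": 1}),
}

def encode(**chosen):
    """Return the microinstruction as a bit string; unspecified fields use code 0."""
    bits = ""
    for name, (width, codes) in FORMAT.items():
        code = codes[chosen[name]] if name in chosen else 0
        bits += format(code, f"0{width}b")
    return bits

# Step 1 of the fetch phase: PCout, MARin, Read, Select4, Add, Zin
fetch1 = encode(F1="PCout", F2="Zin", F3="MARin", F4="Add", F5="Read", F6="Select4")
# Step 2 of the fetch phase: Zout, PCin, Yin, WMFC
fetch2 = encode(F1="Zout", F2="PCin", F3="Yin", F7="WMFC")
print(fetch1, fetch2, len(fetch1))
```

Because each field carries a single code, two signals from the same field (for example PCin and Zin, both in F2) cannot be active in the same microinstruction.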
Further Improvement
Enumerate the patterns of required signals in
all possible microinstructions. Each
meaningful combination of active control
signals can then be assigned a distinct code.
Vertical organization
Horizontal organization
Microprogram Sequencing
If all microprograms require only straightforward
sequential execution of microinstructions except for
branches, letting a μPC govern the sequencing
would be efficient.
However, two disadvantages:
Having a separate microroutine for each machine instruction results
in a large total number of microinstructions and a large control store.
Longer execution time because it takes more time to carry out the
required branches.
Example: Add src, Rdst
Four addressing modes: register, autoincrement,
autodecrement, and indexed (with indirect forms).
- Bit-ORing
- Wide-Branch Addressing
- WMFC
Contents of IR: OP code (bits 11 and above) | Mode = 010 (bits 10-8) | Rsrc (bits 7-4) | Rdst (bits 3-0)
Address Microinstruction
(octal)
000 PCout, MARin, Read, Select4, Add, Zin
001 Zout, PCin, Yin, WMFC
002 MDRout, IRin
003 μBranch {μPC ← 101 (from Instruction decoder);
    μPC5,4 ← [IR10,9]; μPC3 ← ¬[IR10] • ¬[IR9] • [IR8]}
121 Rsrcout, MARin, Read, Select4, Add, Zin
122 Zout, Rsrcin
123 μBranch {μPC ← 170; μPC0 ← ¬[IR8]}, WMFC
170 MDRout, MARin, Read, WMFC
171 MDRout, Yin
172 Rdstout, SelectY, Add, Zin
173 Zout, Rdstin, End
Figure 7.21. Microinstruction for Add (Rsrc)+,Rdst.
Note: Microinstruction at location 170 is not executed for this addressing mode.
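The bit-ORing at addresses 003 and 123 can be checked numerically. This sketch assumes that the IR10 and IR9 terms feeding μPC3, and the IR8 term feeding μPC0, are complemented; that reading is consistent with the mode 010 instruction landing at 121 and with the note that location 170 is skipped for the direct form. The function names are made up.

```python
# Sketch of bit-ORing the microprogram branch address with IR mode bits.
# The complemented terms are an assumed reading of the microroutine.

def target_after_003(ir):
    ir10, ir9, ir8 = (ir >> 10) & 1, (ir >> 9) & 1, (ir >> 8) & 1
    upc = 0o101                                  # base address from the instruction decoder
    upc |= (ir10 << 5) | (ir9 << 4)              # uPC5,4 <- [IR10,9]
    upc |= ((1 - ir10) & (1 - ir9) & ir8) << 3   # uPC3 <- not IR10 . not IR9 . IR8
    return upc

def target_after_123(ir):
    ir8 = (ir >> 8) & 1
    return 0o170 | (1 - ir8)                     # uPC0 <- not [IR8]

autoincrement = 0b010 << 8                       # mode field 010 in IR bits 10-8
autoincrement_indirect = 0b011 << 8              # same mode with IR8 = 1
print(oct(target_after_003(autoincrement)), oct(target_after_123(autoincrement)))
```

With IR8 = 1 the second branch lands on 170 instead of 171, which adds the extra memory read needed for the indirect form.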
Microinstructions with Next-Address Field
The microprogram we discussed requires several
branch microinstructions, which perform no useful
operation in the datapath.
A powerful alternative approach is to include an
address field as a part of every microinstruction to
indicate the location of the next microinstruction to
be fetched.
Pros: separate branch microinstructions are virtually
eliminated; few limitations in assigning addresses to
microinstructions.
Cons: additional bits for the address field (around
1/6)
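Microinstructions with Next-Address Field
[Diagram: IR, external inputs, and condition codes feed the decoding circuits, which load the μAR. The μAR addresses the control store. The control store output is held in the μIR, whose next-address field returns to the decoding circuits and whose remaining fields go to the microinstruction decoder, which produces the control signals.]
Figure 7.22. Microinstruction-sequencing organization.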
Microinstruction fields: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 F10
F0 (8 bits): address of next microinstruction
F1 (3 bits): 000 No transfer; 001 PCout; 010 MDRout; 011 Zout; 100 Rsrcout; 101 Rdstout; 110 TEMPout
F2 (3 bits): 000 No transfer; 001 PCin; 010 IRin; 011 Zin; 100 Rsrcin; 101 Rdstin
F3 (3 bits): 000 No transfer; 001 MARin; 010 MDRin; 011 TEMPin; 100 Yin
F4 (4 bits): 0000 Add; 0001 Sub; ...; 1111 XOR
F5 (2 bits): 00 No action; 01 Read; 10 Write
F6 (1 bit): 0 SelectY; 1 Select4
F7 (1 bit): 0 No action; 1 WMFC
F8 (1 bit): 0 NextAdrs; 1 InstDec
F9 (1 bit): 0 No action; 1 ORmode
F10 (1 bit): 0 No action; 1 ORindsrc
Figure 7.23. Format for microinstructions in the example of Section 7.5.3.
Implementation of the Microroutine
Octal
address  F0        F1   F2   F3   F4    F5  F6  F7  F8  F9  F10
000      00000001  001  011  001  0000  01  1   0   0   0   0
001      00000010  011  001  100  0000  00  0   1   0   0   0
002      00000011  010  010  000  0000  00  0   0   0   0   0
003      00000000  000  000  000  0000  00  0   0   1   1   0
121      01010010  100  011  001  0000  01  1   0   0   0   0
122      01111000  011  100  000  0000  00  0   1   0   0   1
170      01111001  010  000  001  0000  01  0   1   0   0   0
171      01111010  010  000  100  0000  00  0   0   0   0   0
172      01111011  101  011  000  0000  00  0   0   0   0   0
173      00000000  011  101  000  0000  00  0   0   0   0   0
Figure 7.24. Implementation of the microroutine of Figure 7.21 using a
next-microinstruction address field. (See Figure 7.23 for encoded signals.)
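A 27-bit microinstruction in this format can be split back into its fields by slicing at the field widths of Figure 7.23. The decode helper below is a hypothetical sketch; the sample word is the row for octal address 121.

```python
# Sketch: splitting a 27-bit microinstruction into fields F0..F10.
# WIDTHS follows the field sizes of Figure 7.23; decode() is a hypothetical helper.

WIDTHS = [("F0", 8), ("F1", 3), ("F2", 3), ("F3", 3), ("F4", 4), ("F5", 2),
          ("F6", 1), ("F7", 1), ("F8", 1), ("F9", 1), ("F10", 1)]

def decode(word):
    bits = word.replace(" ", "")
    if len(bits) != sum(width for _, width in WIDTHS):
        raise ValueError("expected a 27-bit microinstruction")
    fields, pos = {}, 0
    for name, width in WIDTHS:
        fields[name] = int(bits[pos:pos + width], 2)
        pos += width
    return fields

row_121 = decode("01010010 100 011 001 0000 01 1 0 0 0 0")
print(oct(row_121["F0"]), row_121["F1"], row_121["F2"], row_121["F3"])
```

Here F0 decodes to octal 122, the address of the next microinstruction, and F1, F2, F3 decode to Rsrcout, Zin, and MARin.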
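[Diagram: the Rsrc and Rdst fields of IR each feed a decoder. Combined with Rsrcout, Rsrcin, Rdstout, and Rdstin from the microinstruction decoder (fields F1 and F2), the decoders generate R0in, R0out, ..., R15in, R15out. The decoding circuits take InstDecout, external inputs, condition codes, and the next-address field, and use ORmode and ORindsrc (fields F9 and F10, with F8) for bit-ORing before loading the μAR, which addresses the control store. The microinstruction decoder also produces the other control signals.]
Figure 7.25. Some details of the control-signal-generating circuitry.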
Further Discussions
Prefetching
Emulation