RISC Processor Design
Multicycle Implementation:
MIPS
Virendra Singh
Indian Institute of Science
Bangalore
[email protected]
Lecture 18
SE-273: Processor Design
Courtesy: Prof. Vishwani Agrawal
Mar 07, 2008
SE-273@SERC
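Combined Datapaths
[Figure: single-cycle datapath combining the R-type, load/store, branch, and jump paths. Blocks: PC, instruction memory, register file, sign extend, shift left 2 (two units), ALU, Add, ALU control, data memory, CONTROL, and multiplexers. Labeled signals: RegDst, Branch, Jump, MemRead, MemWrite, MemtoReg, zero. Instruction fields: opcode (bits 26-31), bits 21-25, 16-20, 11-15, 0-15, 0-5, and 0-25.]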
Time for Jump (J-Type)
• ALU (R-type)                6ns
• Load word (I-type)          8ns
• Store word (I-type)         7ns
• Branch on equal (I-type)    5ns
• Jump (J-type)
    Fetch (memory read)       2ns
    Total                     2ns
How Fast Can the Clock Be?
• If every instruction is executed in one clock cycle, then:
  - The clock period must be at least 8ns to accommodate the longest instruction, i.e., lw.
  - This is a single-cycle machine.
  - It is slow because many instructions need less than 8ns but are still allotted the full 8ns.
• Method of speeding up: use a multicycle datapath.
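The clock-period argument can be checked with a few lines of Python. This is only a sketch: the latencies are the ones quoted in this lecture, while the names `LATENCY_NS`, `single_cycle_period`, and `wasted_time` are illustrative and not part of the course material.

```python
# Per-instruction latencies in ns, as quoted in the lecture's timing slide.
LATENCY_NS = {"R-type": 6, "lw": 8, "sw": 7, "beq": 5, "j": 2}

def single_cycle_period(latency_ns):
    """In a single-cycle machine the slowest instruction sets the clock period."""
    return max(latency_ns.values())

def wasted_time(latency_ns):
    """Idle time per instruction when every instruction gets the full period."""
    period = single_cycle_period(latency_ns)
    return {name: period - t for name, t in latency_ns.items()}

period = single_cycle_period(LATENCY_NS)
slack = wasted_time(LATENCY_NS)
print(period, slack)
```

The slack for `j` (6ns out of 8ns) is the inefficiency that motivates the multicycle datapath.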
A Single Cycle Example
[Figure: 32-bit ripple-carry adder built from 32 1-bit full adders. Inputs a31 ... a0 and b31 ... b0, carry-in 0, outputs s31 ... s0 and carry-out c32.]
Delay of 1-bit full adder = 1ns
Clock period: 32ns
Time of adding words ~ 32ns
Time of adding bytes ~ 32ns
A Multicycle Implementation
[Figure: bit-serial adder. Operands a31 ... a0 and b31 ... b0 are shifted into a single 1-bit full adder, the sum is shifted into s31 ... s0, and the carry is held in a flip-flop (FF) initialized to 0; final carry-out is c32.]
Delay of 1-bit full adder = 1ns
Clock period: 1ns
Time of adding words ~ 32ns
Time of adding bytes ~ 8ns
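The bit-serial adder can be modeled in software to see why a byte addition needs only 8 cycles while a word addition needs 32. The following is a minimal sketch, assuming unsigned operands and one full-adder evaluation per clock cycle; the function name `serial_add` is hypothetical.

```python
FULL_ADDER_DELAY_NS = 1  # delay of one 1-bit full adder, from the slide

def serial_add(a, b, width):
    """Add two unsigned width-bit numbers, one bit per clock cycle.

    Returns (sum, carry_out, cycles_used).
    """
    carry = 0   # models the carry flip-flop, initialized to 0
    total = 0
    for i in range(width):                  # one clock cycle per bit, LSB first
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        s = abit ^ bbit ^ carry             # full-adder sum output
        carry = (abit & bbit) | (carry & (abit ^ bbit))  # full-adder carry output
        total |= s << i                     # shift the sum bit into the result
    return total, carry, width

word_sum, word_carry, word_cycles = serial_add(0xFFFFFFFF, 1, 32)
byte_sum, byte_carry, byte_cycles = serial_add(200, 100, 8)
print(word_cycles * FULL_ADDER_DELAY_NS, byte_cycles * FULL_ADDER_DELAY_NS)
```

With a 1ns clock, the number of cycles equals the time in ns, so short operands finish early; the 32ns single-cycle adder cannot exploit that.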
Multi-cycle Datapath
[Figure: PC, Memory (Addr., Data), Instr. reg. (IR), Mem. Data (MDR), Register file, A Reg., B Reg., ALU, and ALUOut Reg., connected by one-cycle data transfer paths.]
One-cycle data transfer paths (need registers to hold data)
Multi-cycle Datapath Requirements
• Only one ALU, since it can be reused.
• Single memory for instructions and data.
• Five registers added:
  - Instruction register (IR)
  - Memory data register (MDR)
  - Three ALU registers: A and B for inputs, and ALUOut for output
Multicycle Datapath
[Figure: complete multicycle datapath. Blocks: PC, Memory (Addr., Data), Instr. reg. (IR), Mem. Data (MDR), Register file, A Reg., B Reg., sign extend, shift left 2 (two units), ALU, ALU control, ALUOut Reg., Control FSM, and multiplexers (in1, in2, control, out). Control signals: IorD, MemRead, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB, ALUOp, PCSource, PCWrite etc. Instruction fields: bits 26-31 to Control FSM, 21-25, 16-20, 11-15, 0-15, 0-5, 0-25; PC bits 28-31 are used for the jump address.]
3 to 5 Cycles for an Instruction
Step | R-type (4 cycles) | Mem. Ref. (4 or 5 cycles) | Branch type (3 cycles) | J-type (3 cycles)
Instruction fetch | IR ← Memory[PC]; PC ← PC+4 (all types)
Instr. decode / Reg. fetch | A ← Reg(IR[21-25]); B ← Reg(IR[16-20]); ALUOut ← PC + (sign extend IR[0-15]) << 2 (all types)
Execution, addr. comp., branch & jump completion | ALUOut ← A op B | ALUOut ← A + sign extend (IR[0-15]) | If (A == B) then PC ← ALUOut | PC ← PC[28-31] || (IR[0-25] << 2)
Mem. access or R-type completion | Reg(IR[11-15]) ← ALUOut | MDR ← M[ALUOut] or M[ALUOut] ← B | (none) | (none)
Memory read completion | (none) | Reg(IR[16-20]) ← MDR | (none) | (none)
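The per-step register transfers can be written down as data, which makes the 3-to-5 cycle counts fall out directly. This is an illustrative sketch: the RTL strings are copied from the step table, and the names `COMMON`, `STEPS`, and `cycles` are assumptions introduced here.

```python
# The first two steps are shared by every instruction type.
COMMON = [
    "IR <- Memory[PC]; PC <- PC + 4",
    "A <- Reg[IR[21-25]]; B <- Reg[IR[16-20]]; "
    "ALUOut <- PC + (sign-extend(IR[0-15]) << 2)",
]

# Remaining steps per instruction class, one list entry per clock cycle.
STEPS = {
    "R-type": COMMON + ["ALUOut <- A op B", "Reg[IR[11-15]] <- ALUOut"],
    "lw": COMMON + ["ALUOut <- A + sign-extend(IR[0-15])",
                    "MDR <- M[ALUOut]",
                    "Reg[IR[16-20]] <- MDR"],
    "sw": COMMON + ["ALUOut <- A + sign-extend(IR[0-15])",
                    "M[ALUOut] <- B"],
    "beq": COMMON + ["if A == B: PC <- ALUOut"],
    "j": COMMON + ["PC <- PC[28-31] || (IR[0-25] << 2)"],
}

# Cycle count of each instruction class is simply the number of steps.
cycles = {name: len(steps) for name, steps in STEPS.items()}
print(cycles)
```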
Cycle 1 of 5: Instruction Fetch (IF)
• Read instruction into IR, M[PC] → IR
  Control signals used:
    IorD     = 0    select PC
    MemRead  = 1    read memory
    IRWrite  = 1    write IR
• Increment PC, PC + 4 → PC
  Control signals used:
    ALUSrcA  = 0    select PC into ALU
    ALUSrcB  = 01   select constant 4
    ALUOp    = 00   ALU adds
    PCSource = 00   select ALU output
    PCWrite  = 1    write PC
Cycle 2 of 5: Instruction Decode (ID)
Bits:    31-26  | 25-21 | 20-16 | 15-11 | 10-6  | 5-0
R-type:  opcode | reg 1 | reg 2 | reg 3 | shamt | fncode
I-type:  opcode | reg 1 | reg 2 | word address increment
J-type:  opcode | word address jump
• Control unit decodes instruction
• Datapath prepares for execution
  - R and I types: reg 1 → A reg, reg 2 → B reg
    No control signals needed
  - Branch type: compute branch address in ALUOut
    ALUSrcA = 0    select PC into ALU
    ALUSrcB = 11   instr. bits 0-15, sign extended and shifted left 2, into ALU
    ALUOp   = 00   ALU adds
Cycle 3 of 5: Execute (EX)
• R type: execute function on reg A and reg B, result in ALUOut
  Control signals used:
    ALUSrcA = 1    A reg into ALU
    ALUSrcB = 00   B reg into ALU
    ALUOp   = 10   instr. bits 0-5 control ALU
• I type, lw or sw: compute memory address in ALUOut ← A reg + sign extend IR[0-15]
  Control signals used:
    ALUSrcA = 1    A reg into ALU
    ALUSrcB = 10   instr. bits 0-15 into ALU
    ALUOp   = 00   ALU adds
Cycle 3 of 5: Execute (EX)
• I type, beq: subtract reg A and reg B, write ALUOut to PC
  Control signals used:
    ALUSrcA = 1    A reg into ALU
    ALUSrcB = 00   B reg into ALU
    ALUOp   = 01   ALU subtracts
    If zero = 1, PCSource = 01      ALUOut to PC
    If zero = 1, PCWriteCond = 1    write PC
  Instruction complete, go to IF
• J type: write jump address to PC ← IR[0-25] shifted left 2, preceded by the four leading bits of PC
  Control signals used:
    PCSource = 10
    PCWrite  = 1    write PC
  Instruction complete, go to IF
Cycle 4 of 5: Reg Write/Memory
• R type: write destination register from ALUOut
  Control signals used:
    RegDst   = 1    instr. bits 11-15 specify reg.
    MemtoReg = 0    ALUOut into reg.
    RegWrite = 1    write register
  Instruction complete, go to IF
• I type, lw: read M[ALUOut] into MDR
  Control signals used:
    IorD     = 1    select ALUOut into mem adr.
    MemRead  = 1    read memory to MDR
• I type, sw: write M[ALUOut] from B reg
  Control signals used:
    IorD     = 1    select ALUOut into mem adr.
    MemWrite = 1    write memory
  Instruction complete, go to IF
Cycle 5 of 5: Reg Write
• I type, lw: write MDR to reg[IR(16-20)]
  Control signals used:
    RegDst   = 0    instr. bits 16-20 are write reg
    MemtoReg = 1    MDR to reg file write input
    RegWrite = 1    write register
  Instruction complete, go to IF
For an alternative method of designing the datapath, see
N. Tredennick, Microprocessor Logic Design, the Flowchart Method,
Digital Press, 1987.
1-bit Control Signals
Signal name | Value = 0 | Value = 1
RegDst | Write reg. # = bits 16-20 | Write reg. # = bits 11-15
RegWrite | No action | Write reg. ← Write data
ALUSrcA | First ALU operand ← PC | First ALU operand ← Reg. A
MemRead | No action | Mem. Data Output ← M[Addr.]
MemWrite | No action | M[Addr.] ← Mem. Data Input
MemtoReg | Reg. File Write In ← ALUOut | Reg. File Write In ← MDR
IorD | Mem. Addr. ← PC | Mem. Addr. ← ALUOut
IRWrite | No action | IR ← Mem. Data Output
PCWrite | No action | PC is written
PCWriteCond | No action | PC is written if zero(ALU) = 1
[Figure: gate network combining zero(ALU), PCWriteCond, and PCWrite into the signal "PCWrite etc.", i.e., PCWrite OR (PCWriteCond AND zero(ALU)).]
2-bit Control Signals
Signal name | Value | Action
ALUOp | 00 | ALU performs add
ALUOp | 01 | ALU performs subtract
ALUOp | 10 | Funct. field (bits 0-5 of IR) determines ALU operation
ALUSrcB | 00 | Second input of ALU ← B reg.
ALUSrcB | 01 | Second input of ALU ← 4 (constant)
ALUSrcB | 10 | Second input of ALU ← bits 0-15 of IR, sign extended to 32 bits
ALUSrcB | 11 | Second input of ALU ← bits 0-15 of IR, sign extended and shifted left 2 bits
PCSource | 00 | ALU output (PC + 4) sent to PC
PCSource | 01 | ALUOut (branch target addr.) sent to PC
PCSource | 10 | Jump address IR[0-25] shifted left 2 bits, concatenated with PC+4[28-31], sent to PC
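The two multiplexers selected by ALUSrcB and PCSource can be sketched as plain functions. This is a behavioral model under stated assumptions, not the lecture's hardware: values are treated as 32-bit unsigned integers, and the helper names `sign_extend16`, `alu_src_b`, and `pc_source` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Interpret the low 16 bits as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def alu_src_b(sel, b_reg, ir):
    """Second ALU input, chosen by the 2-bit ALUSrcB signal."""
    imm = sign_extend16(ir)
    options = {
        0b00: b_reg,                    # B register
        0b01: 4,                        # constant 4 (for PC + 4)
        0b10: imm & MASK32,             # sign-extended IR[0-15]
        0b11: (imm << 2) & MASK32,      # sign-extended and shifted left 2
    }
    return options[sel]

def pc_source(sel, alu_result, alu_out, pc, ir):
    """Next PC value, chosen by the 2-bit PCSource signal."""
    if sel == 0b00:
        return alu_result               # PC + 4 straight from the ALU
    if sel == 0b01:
        return alu_out                  # branch target held in ALUOut
    if sel == 0b10:                     # jump: PC[28-31] || (IR[0-25] << 2)
        return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
    raise ValueError("PCSource value not defined in the table")

beq_back = 0x1000FFFC   # example word: opcode 000100, offset field = -4
jump = 0x08000010       # example word: opcode 000010, target field = 0x10
print(hex(alu_src_b(0b11, 0, beq_back)), hex(pc_source(0b10, 0, 0, 0x40000004, jump)))
```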
Control: Finite State Machine
Start → State 0 (clock cycle 1): Instruction fetch
State 0 → State 1 (clock cycle 2): Instruction decode and register fetch
State 1 → one of the following (clock cycles 3-5):
  FSM-M: memory access instr.
  FSM-R: R-type instr.
  FSM-B: branch instr.
  FSM-J: jump instr.
State 0: Instruction Fetch (CC1)
[Figure: multicycle datapath with the state 0 signals marked: IorD = 0, MemRead = 1, IRWrite = 1, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00 (add), PCSource = 00, PCWrite etc. = 1.]
State 0 Control FSM Outputs
Start → State 0, Instruction fetch:
    MemRead = 1
    ALUSrcA = 0
    IorD = 0
    IRWrite = 1
    ALUSrcB = 01
    ALUOp = 00
    PCWrite = 1
    PCSource = 00
State 0 → State 1, Instruction decode / Register fetch / Branch addr.:
    Outputs?
State 1: Instr. Decode/Reg. Fetch/Branch Address (CC2)
[Figure: multicycle datapath with the state 1 signals marked: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00 (add).]
State 1 Control FSM Outputs
Start → State 0, Instruction fetch (IF):
    MemRead = 1
    ALUSrcA = 0
    IorD = 0
    IRWrite = 1
    ALUSrcB = 01
    ALUOp = 00
    PCWrite = 1
    PCSource = 00
State 0 → State 1, Instruction decode (ID) / Register fetch / Branch addr.:
    ALUSrcA = 0
    ALUSrcB = 11
    ALUOp = 00
State 1 → FSM-M if Opcode = lw, sw
State 1 → FSM-R if Opcode = R-type
State 1 → FSM-B if Opcode = BEQ
State 1 → FSM-J if Opcode = J-type
State 1 (Opcode = lw) FSM-M (CC3-5)
[Figure: multicycle datapath with the lw paths marked by clock cycle. CC3: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00 (add). CC4: IorD = 1, MemRead = 1. CC5: RegDst = 0, MemtoReg = 1, RegWrite = 1.]
State 1 (Opcode = sw) FSM-M (CC3-4)
[Figure: multicycle datapath with the sw paths marked by clock cycle. CC3: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00 (add). CC4: IorD = 1, MemWrite = 1. The figure also labels RegDst = 0.]
FSM-M (Memory Access)
From state 1, Opcode = lw or sw
• Compute mem address: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  - Opcode = lw → Read memory data: MemRead = 1, IorD = 1
      → Write register: RegWrite = 1, MemtoReg = 1, RegDst = 0
  - Opcode = sw → Write memory: MemWrite = 1, IorD = 1
• To state 0 (Instr. Fetch)
State 1 (Opcode = R-type) FSM-R (CC3-4)
[Figure: multicycle datapath with the R-type paths marked by clock cycle. CC3: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10 (funct. code selects the operation). CC4: RegDst = 1 (bits 11-15 select the write register), MemtoReg = 0, RegWrite = 1.]
FSM-R (R-type Instruction)
From state 1, Opcode = R-type
• ALU operation: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
• Write register: RegWrite = 1, MemtoReg = 0, RegDst = 1
• To state 0 (Instr. Fetch)
State 1 (Opcode = beq) FSM-B (CC3)
[Figure: multicycle datapath with the beq paths marked. CC3: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01 (subtract), PCSource = 01; if zero = 1, PCWrite etc. = 1.]
Write PC on zero
[Figure: gate network producing PCWrite etc. With zero = 1 and PCWriteCond = 1, PCWrite etc. = 1 regardless of PCWrite.]
FSM-B (Branch)
From state 1, Opcode = beq
• Write PC on branch condition: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond = 1, PCSource = 01
• Branch condition: if A − B = 0, zero = 1
• To state 0 (Instr. Fetch)
State 1 (Opcode = j) FSM-J (CC3)
[Figure: multicycle datapath with the jump path marked. CC3: PCSource = 10, PCWrite etc. = 1.]
Write PC
[Figure: gate network producing PCWrite etc. With PCWrite = 1, PCWrite etc. = 1 regardless of zero and PCWriteCond.]
FSM-J (Jump)
From state 1, Opcode = jump
• Write jump addr. in PC: PCWrite = 1, PCSource = 10
• To state 0 (Instr. Fetch)
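Control FSM
[Figure: complete state diagram of the control FSM, with numbered states.]
Start → State 0: Instr. fetch / adv. PC
State 0 → State 1: Instr. decode / reg. fetch / branch addr.
State 1 → Compute memory addr. (lw or sw)
    lw: → Read memory data → Write register → State 0
    sw: → Write memory data → State 0
State 1 → ALU operation (R-type) → Write register → State 0
State 1 → Write PC on branch condition (beq) → State 0
State 1 → Write jump addr. to PC (J) → State 0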
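The state diagram can be expressed as a next-state function and stepped through to confirm the cycle counts. This is a minimal sketch: the state names (`fetch`, `decode`, `mem_addr`, and so on) are labels invented here for readability and do not reproduce the slide's state numbering.

```python
# Where state "decode" (state 1) dispatches, depending on the opcode class.
DISPATCH = {
    "lw": "mem_addr", "sw": "mem_addr",
    "R-type": "alu_op", "beq": "branch", "j": "jump",
}

def next_state(state, op):
    """Return the state that follows `state` for instruction class `op`."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        return DISPATCH[op]
    if state == "mem_addr":
        return "mem_read" if op == "lw" else "mem_write"
    if state == "mem_read":
        return "reg_write_lw"
    if state == "alu_op":
        return "reg_write_r"
    # mem_write, reg_write_lw, reg_write_r, branch, jump all finish the instruction
    return "fetch"

def count_cycles(op):
    """Number of clock cycles from fetch until the FSM returns to fetch."""
    state, n = "fetch", 0
    while True:
        n += 1                      # one clock cycle is spent in each state
        state = next_state(state, op)
        if state == "fetch":
            return n

cycle_counts = {op: count_cycles(op) for op in DISPATCH}
print(cycle_counts)
```

Counting the distinct state names used above gives ten states, which is why four state bits are needed.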
Control FSM (Controller)
[Figure: combinational logic with 6 opcode inputs and the present-state inputs produces 16 control outputs and the next-state bits. Four flip-flops (FF), driven by Reset and Clock, feed the next state back as the present state.]
Designing the Control FSM
• Encode states; need 4 bits for 10 states, e.g., state 0 is 0000, state 1 is 0001, and so on.
• Write a truth table for combinational logic:
    Opcode | Present state | Control signals  | Next state
    000000 | 0000          | 0001000110000100 | 0001
    ....   | ....          | ....             | ....
• Synthesize a logic circuit from the truth table.
• Connect four flip-flops between the next state outputs and present state inputs.
Block Diagram of a Processor
[Figure: Controller (Control FSM) connected to Datapath (PC, register file, registers, ALU). Controller inputs: Opcode (6 bits), zero, Overflow, Reset, Clock. Controller outputs: MemRead, MemWrite, PCWriteCond, PCWrite, IRWrite, IorD, MemtoReg, RegWrite, RegDst, ALUSrcA, ALUSrcB (2 bits), PCSource (2 bits), ALUOp (2 bits). The ALU control block takes ALUOp (2 bits) and funct. [0,5] and drives the ALU with ALUOp (3 bits). Memory connections: Mem. Addr., Mem. write data, Mem. data out.]
Exceptions or Interrupts
• Conditions under which the processor may produce an incorrect result or may hang:
  - Illegal or undefined opcode.
  - Arithmetic overflow, divide by zero, etc.
  - Out of bounds memory address.
• EPC: 32-bit register that holds the address of the affected instruction.
• Cause: 32-bit register that holds an encoded exception type. For example,
  - 0 for undefined instruction
  - 1 for arithmetic overflow
Implementing Exceptions
[Figure: exception hardware added to the multicycle datapath. Overflow is sent to the Control FSM. With ALUSrcA = 0, ALUSrcB = 01, and ALUOp = 01 (subtract), the ALU computes PC − 4, which is written to EPC (EPCWrite = 1). A mux with inputs 0 and 1 supplies the exception code to the 32-bit Cause register (CauseWrite = 1). PCSource = 11 selects the address 8000 0180 (hex), and PCWrite etc. = 1 writes it to PC.]
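The register updates performed on an exception can be sketched behaviorally. This is an illustration under assumptions: the function name `take_exception` and the dictionary result are hypothetical, and the PC passed in is assumed to have already been advanced by 4 during instruction fetch, which is why 4 is subtracted to recover the affected instruction's address.

```python
CAUSE_UNDEFINED_INSTRUCTION = 0   # Cause encodings from the lecture
CAUSE_ARITHMETIC_OVERFLOW = 1
HANDLER_ADDRESS = 0x80000180      # address selected by PCSource = 11

def take_exception(pc_after_fetch, cause_code):
    """Return the EPC, Cause, and PC values written when an exception is taken."""
    epc = (pc_after_fetch - 4) & 0xFFFFFFFF   # ALU subtract: PC - 4
    return {"EPC": epc, "Cause": cause_code, "PC": HANDLER_ADDRESS}

state = take_exception(0x00400024, CAUSE_ARITHMETIC_OVERFLOW)
print({name: hex(value) for name, value in state.items()})
```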
How Long Does It Take? Again
• Assume control logic is fast and does not affect the critical timing. Major time components are ALU, memory read/write, and register read/write.
• Time for hardware operations, suppose:
    Memory read or write    2ns
    Register read           1ns
    ALU operation           2ns
    Register write          1ns
Single-Cycle Datapath
• R-type                      6ns
• Load word (I-type)          8ns
• Store word (I-type)         7ns
• Branch on equal (I-type)    5ns
• Jump (J-type)               2ns
• Clock cycle time          = 8ns
• Each instruction takes one cycle
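The single-cycle latencies follow from summing the hardware operation times along each instruction's path. The sketch below assumes the usual paths (for example, lw uses instruction fetch, register read, ALU, data memory, and register write); the names `OP_NS` and `PATHS` are illustrative.

```python
# Hardware operation times in ns, as assumed in the lecture.
OP_NS = {"mem": 2, "reg_read": 1, "alu": 2, "reg_write": 1}

# Units used by each instruction class, in order (assumed paths).
PATHS = {
    "R-type": ["mem", "reg_read", "alu", "reg_write"],
    "lw":     ["mem", "reg_read", "alu", "mem", "reg_write"],
    "sw":     ["mem", "reg_read", "alu", "mem"],
    "beq":    ["mem", "reg_read", "alu"],
    "j":      ["mem"],
}

# Latency of an instruction is the sum of the delays along its path.
latency_ns = {name: sum(OP_NS[unit] for unit in path)
              for name, path in PATHS.items()}

# The single-cycle clock must cover the slowest instruction.
single_cycle_clock_ns = max(latency_ns.values())
print(latency_ns, single_cycle_clock_ns)
```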
Multicycle Datapath
• Clock cycle time is determined by the longest operation, ALU or memory:
    Clock cycle time = 2ns
• Cycles per instruction (CPI):
    lw        5    (10ns)
    sw        4    (8ns)
    R-type    4    (8ns)
    beq       3    (6ns)
    j         3    (6ns)
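The multicycle instruction times are the cycle counts multiplied by the 2ns clock. A short sketch, using the values from this lecture with illustrative variable names:

```python
# Hardware operation times in ns; the slowest one sets the multicycle clock.
OP_NS = {"memory": 2, "reg_read": 1, "alu": 2, "reg_write": 1}
CYCLES = {"lw": 5, "sw": 4, "R-type": 4, "beq": 3, "j": 3}

multicycle_clock_ns = max(OP_NS.values())

# Time per instruction = number of cycles x clock period.
time_ns = {name: n * multicycle_clock_ns for name, n in CYCLES.items()}
print(multicycle_clock_ns, time_ns)
```

Note that lw now takes 10ns, longer than its 8ns in the single-cycle design, because the 1ns register operations each occupy a full 2ns cycle.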
CPI of a Computer
CPI = [ Σk (instructions of type k) × CPIk ] / [ Σk (instructions of type k) ]
where
CPIk = cycles for an instruction of type k
Note:
CPI depends on the instruction mix of the program being run. Standard benchmark programs are used for specifying the performance of CPUs.
Example
• Consider a program containing:
    loads         25%
    stores        10%
    branches      11%
    jumps          2%
    arithmetic    52%
CPI = 0.25 × 5 + 0.10 × 4 + 0.11 × 3 + 0.02 × 3 + 0.52 × 4
    = 4.12 for multicycle datapath
CPI = 1.00 for single-cycle datapath
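The weighted-average CPI for this instruction mix can be reproduced directly. A minimal sketch, where the function name `cpi` and the dictionary layout are assumptions made for illustration:

```python
# Instruction mix (fractions of the program) and multicycle cycle counts.
MIX = {"lw": 0.25, "sw": 0.10, "beq": 0.11, "j": 0.02, "R-type": 0.52}
CYCLES = {"lw": 5, "sw": 4, "beq": 3, "j": 3, "R-type": 4}

def cpi(mix, cycles):
    """Weighted average of cycles per instruction over the mix."""
    total = sum(mix.values())
    return sum(mix[k] * cycles[k] for k in mix) / total

multicycle_cpi = cpi(MIX, CYCLES)
single_cycle_cpi = cpi(MIX, {k: 1 for k in MIX})
print(round(multicycle_cpi, 2), round(single_cycle_cpi, 2))
```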
Multicycle vs. Single-Cycle
Performance ratio = Single cycle time / Multicycle time
                  = (CPI × cycle time) for single-cycle / (CPI × cycle time) for multicycle
                  = (1.00 × 8ns) / (4.12 × 2ns)
                  = 0.97
Single cycle is faster in this case, but remember, performance ratio depends on the instruction mix.
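The performance ratio, and the CPI at which the two designs break even, can be computed as follows. This is a sketch using the lecture's numbers; `time_per_instruction` and `break_even_cpi` are names introduced here.

```python
def time_per_instruction(cpi, cycle_ns):
    """Average execution time per instruction in ns."""
    return cpi * cycle_ns

single = time_per_instruction(1.00, 8)   # single-cycle: CPI 1.00, 8ns clock
multi = time_per_instruction(4.12, 2)    # multicycle: CPI 4.12, 2ns clock

# A ratio below 1 means the single-cycle design is faster for this mix.
ratio = single / multi

# The multicycle design wins only if its CPI drops below this value.
break_even_cpi = 8 / 2
print(round(ratio, 2), break_even_cpi)
```

For this mix the multicycle CPI of 4.12 is slightly above the break-even value of 4.0, which is why the ratio comes out just under 1.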
Alternate: CPU Implementation and Microprogramming
Microprogram: An alternative implementation of the controller.
Thank You