Computer Architecture and Organization

Here are the key requirements for the MIPS datapath based on the instructions: - A 32-entry, 32-bit register file to store the 32 registers - Ability to read two registers from the register file for source operands - An ALU to perform arithmetic and logical operations on the source operands - Ability to write the result of the ALU operation back to a register This will allow the datapath to execute the ADD, SUB, and ORI instructions by reading registers, performing the operation, and writing the result back to a register.

Uploaded by

Aurangzeb Rashid Masud

55:035

Computer Architecture and Organization

Lecture 9
Outline
 Building a CPU
 Basic Components
 MIPS Instructions
 Basic 5 Steps for CPU
 Single-Cycle Design
 Multi-cycle Design
 Comparison of Single and Multi-cycle Designs


Overview
 Brief look
 Digital logic

 CPU Datapath
 MIPS Example

Digital Logic
[Figure: D-type flip-flop (input D, output Q, edge-triggered Clock); multiplexer (inputs A and B on ports 0 and 1, select input S, output F); D-type flip-flop with enable (EN), shown as a multiplexer in front of a flip-flop and as a single symbol.]

Digital Logic
Registers
[Figure: registers built from D-type flip-flops sharing EN and an edge-triggered Clock, shown for 1 bit, 4 bits (D3..D0, Q3..Q0), and N bits.]

Digital Logic
Tri-state Driver (Buffer)

In   Drive   Out
0    0       Z
1    0       Z
0    1       0
1    1       1

What is Z ??
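In digital logic, Z conventionally denotes the high-impedance state: the driver is disconnected and does not drive its output at all. A minimal Python sketch of the truth table follows; the function name `tri_state` and the use of the string "Z" are illustrative choices, not part of the slides.

```python
def tri_state(in_bit: int, drive: int):
    """Model a tri-state driver: pass the input when driven, else float ("Z")."""
    return in_bit if drive == 1 else "Z"

# Walk the four rows of the truth table in the same order as the slide.
for drive in (0, 1):
    for in_bit in (0, 1):
        print(in_bit, drive, tri_state(in_bit, drive))
```

Because an undriven output floats, several tri-state drivers can share one wire as long as at most one of them has drive = 1 at a time.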

Digital Logic
Adder/Subtractor or ALU
[Figure: ALU block with operand inputs A and B, control input Add/sub (or ALUop), Carry-in, and Carry-out.]

Overview
 Brief look
 Digital logic

 How to Design a CPU Datapath

 MIPS Example


Designing a CPU: 5 Steps
1. Analyze the instruction set → datapath requirements
   MIPS: ADD, SUB, ORI, LW, SW, BR
   Meaning of each instruction given by RTL (register transfers)
   2 types of registers: CPU/ISA registers, temporary registers

2. Datapath requirements → select the datapath components
   ALU, register file, adder, data memory, etc.

3. Assemble the datapath
   Datapath must support planned register transfers
   Ensure all instructions are supported

4. Analyze datapath control required for each instruction

5. Assemble the control logic

Step 1a: Analyze ISA
 All MIPS instructions are 32 bits long.
 Three instruction formats:

Format   Fields [bit range], width
R-type   op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | rd [15:11] 5 bits | shamt [10:6] 5 bits | funct [5:0] 6 bits
I-type   op [31:26] 6 bits | rs [25:21] 5 bits | rt [20:16] 5 bits | immediate [15:0] 16 bits
J-type   op [31:26] 6 bits | target address [25:0] 26 bits

 R: registers, I: immediate, J: jumps
 These formats were intentionally chosen to simplify design

Step 1b: Analyze ISA
 Meaning of the fields (R-type, I-type, and J-type formats as in Step 1a):
   op: operation of the instruction
   rs, rt, rd: the source and destination register specifiers
     Destination is either rd (R-type) or rt (I-type)
   shamt: shift amount
   funct: selects the variant of the operation in the “op” field
   immediate: address offset or immediate value
   target address: target address of the jump instruction
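The field layout can be made concrete with a few shifts and masks. The sketch below is illustrative only: the function name `decode` and the dictionary keys are assumptions chosen to match the field names on the slide, and every field is extracted regardless of format (the control logic decides which ones are meaningful).

```python
def decode(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":     (instr >> 26) & 0x3F,   # bits 31:26
        "rs":     (instr >> 21) & 0x1F,   # bits 25:21
        "rt":     (instr >> 16) & 0x1F,   # bits 20:16
        "rd":     (instr >> 11) & 0x1F,   # bits 15:11 (R-type)
        "shamt":  (instr >> 6) & 0x1F,    # bits 10:6  (R-type)
        "funct":  instr & 0x3F,           # bits 5:0   (R-type)
        "imm16":  instr & 0xFFFF,         # bits 15:0  (I-type)
        "target": instr & 0x03FFFFFF,     # bits 25:0  (J-type)
    }

# 0x012A4021 encodes "addu $8, $9, $10" (op = 0, funct = 0x21).
fields = decode(0x012A4021)
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```

Since the register specifiers sit at the same bit positions in the R-type and I-type formats, the register file can be addressed before the instruction type is known, which is one way the formats simplify the design.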
MIPS ISA: subset for today
 ADD and SUB (R-type: op | rs | rt | rd | shamt | funct)
   addU rd, rs, rt
   subU rd, rs, rt

 OR Immediate (I-type: op | rs | rt | immediate)
   ori rt, rs, imm16

 LOAD and STORE Word (I-type: op | rs | rt | immediate)
   lw rt, rs, imm16
   sw rt, rs, imm16

 BRANCH (I-type: op | rs | rt | immediate)
   beq rs, rt, imm16

Step 2: Datapath Requirements
 MIPS ISA requires 32 registers, 32b each
   Called a register file
   Contains 32 entries
   Each entry is 32b
 AddU rd,rs,rt or SubU rd,rs,rt
   Read two sources rs, rt
   Operation rs + rt or rs - rt
   Write destination rd ← rs+/-rt
 Requirements
   Read two registers (rs, rt)
   Perform ALU operation
   Write a third register (rd)
 How to implement?

[Figure: REGISTER FILE (REGFILE) with register-number inputs RdReg1, RdReg2, WrReg (5 bits each), data ports RdData1, RdData2, WrData, and control input RegWrite; ALU with control input ALUop and outputs Result and Zero?.]
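The three requirements (two reads, one ALU operation, one write) can be sketched in a few lines of Python. This is a behavioral sketch under assumptions, not the lecture's implementation: the class name `RegFile`, the method names, and the string-valued `alu_op` are hypothetical, and special handling of register 0 is not modeled.

```python
MASK32 = 0xFFFFFFFF

class RegFile:
    """32 entries of 32 bits, two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rd_reg1: int, rd_reg2: int):
        # Two read ports: both register numbers are looked up together.
        return self.regs[rd_reg1], self.regs[rd_reg2]

    def write(self, wr_reg: int, wr_data: int, reg_write: bool):
        # One write port, gated by the RegWrite control signal.
        if reg_write:
            self.regs[wr_reg] = wr_data & MASK32

def alu(a: int, b: int, alu_op: str):
    """Return (Result, Zero?) for an add or subtract, truncated to 32 bits."""
    result = (a + b if alu_op == "add" else a - b) & MASK32
    return result, result == 0

rf = RegFile()
rf.write(9, 5, True)
rf.write(10, 7, True)
a, b = rf.read(9, 10)              # read rs and rt
result, zero = alu(a, b, "add")    # perform the ALU operation
rf.write(8, result, True)          # write rd
print(rf.regs[8], zero)
```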
Step 3: Datapath Assembly
 ADDU rd, rs, rt    SUBU rd, rs, rt
   Need an ALU
   Hook it up to REGISTER FILE
   REGFILE has 2 read ports (rs,rt), 1 write port (rd)
 Parameters come from instruction fields
 Control signals depend upon instruction fields
   E.g.: ALUop = f(Instruction) = f(op, funct)

[Figure: rs, rt, rd drive RdReg1, RdReg2, WrReg of the REGFILE; RdData1 and RdData2 feed the ALU; the ALU Result feeds WrData. Control signals: RegWrite, ALUop. ALU status output: Zero?.]
Steps 2 and 3: ORI Instruction
 ORI rt, rs, Imm16
   Need new ALUop for ‘OR’ function, hook up to REGFILE
   1 read port (rs), 1 write port (rt), 1 const value (Imm16)
 Control signals depend upon instruction fields
   E.g.: ALUsrc = f(Instruction) = f(op, funct)

[Figure: rs drives RdReg1 and rt drives WrReg; a multiplexer (select: ALUsrc) chooses between RdData2 (input 0) and the ZERO-EXTENDed 16-bit Imm16 (input 1) as the second ALU operand. Control signals: RegWrite, ALUop, ALUsrc.]
Steps 2 and 3: Destination Register
 Must select proper destination, rd or rt
   Depends on Instruction Type
     R-type may write rd
     I-type may write rt

[Figure: the ORI datapath with an added multiplexer (select: RegDst) that chooses rd or rt as WrReg. Control signals: RegWrite, RegDst, ALUsrc, ALUop.]

Steps 2 and 3: Load Word
 LW rt, rs, Imm16
   Need Data Memory: data ← Mem[Addr]
     Addr is rs+Imm16, Imm16 is signed, use ALU for +
   Store in rt: rt ← Mem[rs+Imm16]

[Figure: the previous datapath plus DATAMEM (Addr taken from the ALU Result, output RdData); the extender becomes SIGN/ZERO-EXTEND, selected by ExtOp; a multiplexer (select: MemtoReg) chooses between the ALU Result and DATAMEM RdData as REGFILE WrData. Control signals: RegDst, RegWrite, ALUsrc, ALUop, ExtOp, MemtoReg.]
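The difference between zero extension (ORI) and sign extension (LW, SW) is easy to see in code. The sketch below is an illustration only; the function names and the convention that `ext_op = 1` selects sign extension are assumptions, since the slides do not give the ExtOp encoding.

```python
MASK32 = 0xFFFFFFFF

def zero_ext(imm16: int) -> int:
    """Pad the upper 16 bits with zeros."""
    return imm16 & 0xFFFF

def sign_ext(imm16: int) -> int:
    """Copy bit 15 into the upper 16 bits (result as a 32-bit pattern)."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def extend(imm16: int, ext_op: int) -> int:
    # Assumed encoding: ext_op = 1 means sign-extend, 0 means zero-extend.
    return sign_ext(imm16) if ext_op else zero_ext(imm16)

# LW address calculation: Addr = rs + sign_ext(Imm16), wrapped to 32 bits.
rs_value = 0x1000
addr = (rs_value + extend(0xFFFC, 1)) & MASK32   # Imm16 = -4
print(hex(addr))
```

With a signed offset, a load can reach addresses both above and below the base register, which is why the address calculation needs sign extension while ORI uses zero extension.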
Steps 2 and 3: Store Word
 SW rt, rs, Imm16
   Need Data Memory: Mem[Addr] ← data
     Addr is rs+Imm16, Imm16 is signed, use ALU for +
   Store in Mem: Mem[rs+Imm16] ← rt

[Figure: the Load Word datapath with RdData2 also routed to the DATAMEM WrData input, plus a new control signal MemWrite. Control signals: RegDst, RegWrite, ALUsrc, ALUop, ExtOp, MemWrite, MemtoReg.]

Writes: Need to Control Timing
 Problem: write to data memory
 Data can come anytime
 Addr must come first
 MemWrite must come after Addr
 Else? writes to wrong Addr!

 Solution: use ideal data memory

 Assume everything works ok
 How to fix this for real?
 One solution: synchronous memory
 Another solution: delay MemWr to come late

 Problems?: write to register file

 Does RegWrite signal come after WrReg number?
 When does the write to a register happen?
 Read from same register as being written?


Missing Pieces: Instruction Fetching
 Where does the Instruction come from?
 From instruction memory, of course!

 Recall: stored-program concept

 Alternatives? How about hard-coding wires and switches…? This
is how ENIAC was programmed!

 How to branch?
 BEQ rs, rt, Imm16


Instruction Processing
 Fetch instruction
 Execute instruction

 Fetch next instruction

 Execute next instruction

 Fetch next instruction

 Execute next instruction

 Etc…

 How to maintain sequence? Use a counter!

 Branches (out of sequence) ? Load the counter!


Instruction Processing
 Program Counter
 Points to current instruction

 Address to instruction memory

 Instr ← InstrMem[PC]

 Next instruction: counts up by 4

 Remember: memory is byte-addressable, instructions are 4 bytes
 PC ← PC + 4

 Branch instruction: replace PC contents

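The two ways the PC changes can be sketched as a single function. This is a hedged illustration: the function name `next_pc` and its parameters are hypothetical, and the branch-target formula (PC + 4 plus the sign-extended Imm16 shifted left by 2) is the one used for BEQ in this lecture's RTL.

```python
MASK32 = 0xFFFFFFFF

def sign_ext(imm16: int) -> int:
    """Interpret a 16-bit field as a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc(pc: int, branch_taken: bool = False, imm16: int = 0) -> int:
    """Sequential case: count up by 4. Taken branch: replace PC contents."""
    pc_plus_4 = (pc + 4) & MASK32
    if not branch_taken:
        return pc_plus_4
    # Word offset -> byte offset: shift left by 2 (append b'00').
    return (pc_plus_4 + (sign_ext(imm16) << 2)) & MASK32

print(hex(next_pc(0x00400000)))            # next sequential instruction
print(hex(next_pc(0x00400000, True, 3)))   # branch forward by 3 instructions
```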

Step 1: Analyze Instructions
 Register Transfer Language…
op | rs | rt | rd | shamt | funct = InstrMem[ PC ]
op | rs | rt | Imm16 = InstrMem[ PC ]

Instr    Register Transfers
ADDU     R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU     R[rd] ← R[rs] - R[rt]; PC ← PC + 4
ORI      R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
LOAD     R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
STORE    MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
BEQ      if ( R[rs] == R[rt] ) then PC ← PC + 4 + { sign_ext(Imm16) || b’00’ }
         else PC ← PC + 4
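The register transfers above can be executed directly by a small interpreter. The sketch below is a behavioral model under assumptions, not the datapath itself: the names `step`, `i_type`, and `r_type` are hypothetical, both memories are modeled as Python dictionaries keyed by byte address, and the opcode and funct values are the standard MIPS encodings (addu 0x21, subu 0x23, ori 0x0D, lw 0x23, sw 0x2B, beq 0x04).

```python
MASK32 = 0xFFFFFFFF

def sign_ext(imm16: int) -> int:
    """Interpret a 16-bit field as a signed Python integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def step(pc, regs, data_mem, instr_mem):
    """Execute one instruction per the RTL table and return the next PC."""
    instr = instr_mem[pc]
    op = instr >> 26
    rs, rt, rd = (instr >> 21) & 31, (instr >> 16) & 31, (instr >> 11) & 31
    funct, imm16 = instr & 0x3F, instr & 0xFFFF
    next_pc = pc + 4
    if op == 0x00 and funct == 0x21:      # ADDU
        regs[rd] = (regs[rs] + regs[rt]) & MASK32
    elif op == 0x00 and funct == 0x23:    # SUBU
        regs[rd] = (regs[rs] - regs[rt]) & MASK32
    elif op == 0x0D:                      # ORI: zero-extended immediate
        regs[rt] = regs[rs] | imm16
    elif op == 0x23:                      # LOAD
        regs[rt] = data_mem.get((regs[rs] + sign_ext(imm16)) & MASK32, 0)
    elif op == 0x2B:                      # STORE
        data_mem[(regs[rs] + sign_ext(imm16)) & MASK32] = regs[rt]
    elif op == 0x04:                      # BEQ
        if regs[rs] == regs[rt]:
            next_pc = pc + 4 + (sign_ext(imm16) << 2)
    else:
        raise ValueError(f"unsupported instruction {instr:#010x}")
    return next_pc & MASK32

def i_type(op, rs, rt, imm16):
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)

def r_type(rs, rt, rd, funct):
    return (rs << 21) | (rt << 16) | (rd << 11) | funct

program = [
    i_type(0x0D, 0, 1, 5),      # ori  $1, $0, 5
    i_type(0x0D, 0, 2, 7),      # ori  $2, $0, 7
    r_type(1, 2, 3, 0x21),      # addu $3, $1, $2
    i_type(0x2B, 0, 3, 16),     # sw   $3, 16($0)
    i_type(0x23, 0, 4, 16),     # lw   $4, 16($0)
]
instr_mem = {4 * i: word for i, word in enumerate(program)}
regs, data_mem, pc = [0] * 32, {}, 0
while pc in instr_mem:          # run until the PC leaves the program
    pc = step(pc, regs, data_mem, instr_mem)
print(regs[3], data_mem[16], regs[4])
```

Each branch of the `if` chain corresponds to one row of the RTL table, which is the property the datapath has to preserve: every listed register transfer needs a physical path.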
Steps 2 and 3: Datapath & Assembly
[Figure: PC drives the Read address of the Instruction Memory, which outputs Instruction[31:0]; an adder computes PC + 4.]
 PC: a register
   Counter, counts by +4
   Provides address to Instruction Memory

Steps 2 and 3: Datapath & Assembly
[Figure: PC, Instruction Memory, and the PC + 4 adder as before; a second adder sums PC + 4 with the Shift Left 2 output; a multiplexer (select: PCSrc) chooses PC + 4 (input 0) or the second adder's result (input 1) as the next PC. Instruction fields shown: Instruction[25:21], [20:16], [15:11], and [15:0] (Imm16), the last feeding the Sign/Zero Extend block (16 to 32 bits, controlled by ExtOp).]
 PC: a register
   Counter, counts by +4
   Sometimes, must add SignExtend{Imm16||b’00’} for branch instructions
 Note: the sign-extender for Imm16 is already in the datapath (everything else is new)

Steps 2 and 3: Add Previous Datapath
[Figure: complete single-cycle datapath. Instruction fetch (PC, Instruction Memory, PC + 4 adder, branch adder with Shift Left 2, PCSrc multiplexer) is joined to the Register File (Read reg. 1, Read reg. 2, Write reg., Write data, Read data 1, Read data 2), the Sign/Zero Extend block, the ALU (Zero, ALU result) with its ALU Control block driven by Instruction[5:0] (funct) and ALUOp, and the Data Memory (Address, Write data, Read data). Control signals: RegWrite, RegDst, ALUSrc, ALUOp, ExtOp, MemWrite, MemtoReg, PCSrc.]

What have we done?
 Created a simple CPU datapath
 Control still missing (next slide)

 Single-cycle CPU
 Every instruction takes 1 clock cycle
 Clocking ?


One Clock Cycle
 Clock Locations
   PC, REGFILE have clocks

 Operation
   On rising edge, PC will get new value
     Maybe REGFILE will have one value updated as well
   After rising edge
     PC and REGFILE can’t change
     New value out of PC
     Instruction out of INSTRMEM
     Instruction selects registers to read from REGFILE
     Instruction controls ALUop, ALUsrc, MemWrite, ExtOp, etc
     ALU does its work
     DataMem may be read (depending on instruction)
     Result value goes back to REGFILE
     New PC value goes back to PC
     Await next clock edge
 Lots to do in only 1 clock cycle !!

Missing Steps?
 Control is missing (Steps 4 and 5 we mentioned earlier)
 Generate the green signals
 ALUsrc, MemWrite, MemtoReg, PCSrc, RegDst, etc
 These are all f(Instruction), where f() is a logic expression
 Will look at control strategies in upcoming lecture

 Implementation Details
 How to implement REGFILE?
 Read port: tristate buffers? Multiplexer? Memory?
 Two read ports: two of above?
 Write port: how to write only 1 register?
 How to control writes to memory? To register file?

 More instructions
 Shift instructions
 Jump instruction
 Etc


1-Cycle CPU Datapath
[Figure: the complete single-cycle datapath (PC, Instruction Memory, Register File, Sign/Zero Extend, ALU with ALU Control, Data Memory, PC + 4 and branch adders) with control signals RegWrite, RegDst, ALUSrc, ALUOp, ExtOp, MemWrite, MemtoReg, PCSrc.]

1-cycle CPU Datapath + Control
[Figure: the single-cycle datapath with a Control block driven by Instruction[31:26]. Control outputs: RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite. The ALU control block is driven by Instruction[5:0] and ALUOp; PCSrc selects the next PC.]

1-cycle CPU Control - Lookup Table

          Signal Name   R-format   Lw   Sw   Beq
Inputs    Op5           0          1    1    0
          Op4           0          0    0    0
          Op3           0          0    1    0
          Op2           0          0    0    1
          Op1           0          1    1    0
          Op0           0          1    1    0
Outputs   RegDst        1          0    X    X
          ALUSrc        0          1    1    0
          MemtoReg      0          1    X    X
          RegWrite      1          1    0    0
          MemRead       0          1    0    0
          MemWrite      0          0    1    0
          Branch        0          0    0    1
          ALUOp1        1          0    0    0
          ALUOp0        0          0    0    1

 Also: I-type instructions (ORI) & ExtOp (sign-extend control), etc.
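Because every output in the table is a pure function of the opcode, the single-cycle controller can be modeled as a dictionary lookup. The sketch below transcribes the four columns of the table; the name `CONTROL`, the helper `control_for`, and the use of `None` for don't-care (X) entries are illustrative assumptions.

```python
# Keys are the 6-bit opcode Op5..Op0; None marks a don't-care (X) output.
CONTROL = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),   # R-format
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),   # Lw
    0b101011: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),   # Sw
    0b000100: dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),   # Beq
}

def control_for(instr: int) -> dict:
    """Look up the control outputs from Instruction[31:26]."""
    return CONTROL[(instr >> 26) & 0x3F]

signals = control_for(0x8C220008)   # lw $2, 8($1)
print(signals["MemRead"], signals["RegWrite"], signals["ALUSrc"])
```

In hardware the same table could be realized as a small ROM or as combinational logic, which is why the single-cycle controller is described as simple.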

1-cycle CPU + Jump Instruction
[Figure: the single-cycle datapath and control extended for jumps. The jump address [31..0] is formed from Instruction[25:0] and the upper bits PC + 4 [31..28]. Instruction fields shown: [31:26], [25:21], [20:16], [15:11], [15:0], [5:0].]
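The jump-address construction in the figure can be written out as bit operations. This sketch assumes the standard MIPS rule for the J format, in which the 26-bit target is shifted left by 2 (word address to byte address) and combined with the upper 4 bits of PC + 4; the function name `jump_address` is hypothetical.

```python
def jump_address(pc: int, instr: int) -> int:
    """Build the 32-bit jump address from PC+4[31..28] and Instruction[25:0]."""
    target26 = instr & 0x03FFFFFF            # Instruction[25:0]
    upper4 = (pc + 4) & 0xF0000000           # PC + 4 [31..28]
    return upper4 | (target26 << 2)          # low two bits are always 00

j_instr = (0x02 << 26) | 0x0100004           # "j" with target field 0x0100004
print(hex(jump_address(0x00400000, j_instr)))
```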
1-cycle CPU Problems?
 Every instruction 1 cycle
 Some instructions “do more work”
 Eg, lw must read from DATAMEM
 All instructions must have same clock period…

 Many instructions run slower than necessary

 Tricky timing on MemWrite, RegWrite(?) signals

 Write signal must come *after* address is stable

 Need extra resources…

 PC+4 adder, ALU for BEQ instruction, DATAMEM+INSTRMEM


Performance!
 Single-Cycle CPU Performance
 Execute one instruction per clock cycle (CPI=1)
 Clock cycle time? Note dataflow includes:
 INSTRMEM read
 REGFILE access
 Sign extension
 ALU operation
 DATAMEM read
 REGFILE/PC write
 Not every instruction uses all resources (eg, DATAMEM read)
 Can we change clock period for each instruction?
 No! (Why not?)
 One clock period: the worst case!
 This is why a single-cycle CPU is not good for performance

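The worst-case argument can be illustrated with a small calculation. All delay values below are hypothetical numbers chosen only for the example (the slides give none), and the per-instruction unit lists are a simplification of the dataflow listed above.

```python
# Hypothetical unit delays in nanoseconds (illustrative assumptions only).
DELAY_NS = {
    "INSTRMEM": 2.0, "REGFILE_READ": 1.0, "ALU": 2.0,
    "DATAMEM": 2.0, "REG_WRITE": 1.0,
}

# Units each instruction class passes through in a single-cycle design.
PATHS = {
    "R-type": ["INSTRMEM", "REGFILE_READ", "ALU", "REG_WRITE"],
    "LW":     ["INSTRMEM", "REGFILE_READ", "ALU", "DATAMEM", "REG_WRITE"],
    "SW":     ["INSTRMEM", "REGFILE_READ", "ALU", "DATAMEM"],
    "BEQ":    ["INSTRMEM", "REGFILE_READ", "ALU"],
}

path_ns = {name: sum(DELAY_NS[u] for u in units) for name, units in PATHS.items()}
clock_period_ns = max(path_ns.values())       # one period: the worst case
slowest = max(path_ns, key=path_ns.get)

for name, ns in path_ns.items():
    print(f"{name}: needs {ns} ns, wastes {clock_period_ns - ns} ns per cycle")
print("clock period set by", slowest)
```

Under these assumed delays the load sets the clock period, and every other instruction idles for part of each cycle.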

1-cycle CPU Datapath + Controller
[Figure: the complete single-cycle datapath with controller and jump support. The jump address [31..0] is formed from Instruction[25:0] and PC + 4 [31..28]. Instruction fields shown: [31:26], [25:21], [20:16], [15:11], [15:0], [5:0].]

1-cycle CPU Summary
 Operation
 1 cycle per instruction
 Control signals held fixed during entire cycle (except BRANCH)
 Only 2 registers
 PC, updated every clock cycle
 REGFILE, updated when required
 During clock cycle, data flows from register-outputs to register-inputs
 Fixed clock frequency / period

 Performance
 1 instruction per cycle
 Slowest instruction determines clock frequency

 Outstanding issue: MemWrite timing

 Assume this signal writes to memory at end of clock cycle


Multi-cycle CPU Goals
 Improve performance
   Break each instruction into smaller steps / multiple cycles
     LW instruction → 5 cycles
     SW instruction → 4 cycles
     R-type instruction → 4 cycles
     Branch, Jump → 3 cycles
   Aim for 5x clock frequency
     Complex instructions (eg, LW) → 5 cycles → same performance as before
     Simple instructions (eg, ADD) → fewer cycles → faster

 Save resources (gates/transistors)
   Re-use ALU over multiple cycles
   Put INSTR + DATA in same memory

 MemWrite timing solved?

Multi-cycle CPU Datapath
[Figure: multi-cycle datapath with one Memory (Address, MemData, Write data) shared by instructions and data, Registers (RdReg1, RdReg2, Write reg, Write data, RdData1, RdData2), Sign Extend and Shift Left 2 blocks, and a single ALU (Zero, ALU result) whose operand multiplexers can select the constant 4.]

 Add multiplexers + control signals (IorD, MemtoReg, ALUSrcA, ALUSrcB)

 Move signal paths (+4, Shift Left 2)

Multi-cycle CPU Datapath
[Figure: the same datapath with registers added between stages: Instruction Register, Memory Data Register, A, B, ALUOut.]

 Add registers + control signals (IR, MDR, A, B, ALUOut)

 Registers with no control signal load value every clock cycle (eg, PC)

Instruction Execution Example
 Execute a “Load Word” instruction
 LW rt, 0(rs)

 5 Steps
1. Fetch instruction
2. Read registers
3. Compute address
4. Read data
5. Write registers


Load Word Instruction Sequence
[Figure: the multi-cycle datapath shown once per step with the active path highlighted, then once with all 5 steps shown.]

1. Fetch Instruction: InstructionRegister ← Mem[PC]
2. Read Registers: A ← Registers[Rs]
3. Compute Address: ALUOut ← A + SignExt(Imm16)
4. Read Data: MDR ← Memory[ALUOut]
5. Write Registers: Registers[Rt] ← MDR

Multi-cycle Load Word: Recap
1. Fetch Instruction    InstructionRegister ← Mem[PC]

2. Read Registers       A ← Registers[Rs]

3. Compute Address      ALUOut ← A + SignExt(Imm16)

4. Read Data            MDR ← Memory[ALUOut]

5. Write Registers      Registers[Rt] ← MDR

 Missing Steps?
   Must increment the PC
   Do it as part of the instruction fetch (in step 1):
     1. Fetch Instruction    InstructionRegister ← Mem[PC]; PC ← PC + 4
   Need PCWrite control signal
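The five load-word steps, including the PC increment in step 1, can be traced with a short sketch. It is a behavioral illustration under assumptions: the function name `load_word_steps` is hypothetical, the shared instruction and data memory is modeled as a dictionary keyed by byte address, and the local variables stand in for the IR, A, ALUOut, and MDR registers.

```python
MASK32 = 0xFFFFFFFF

def sign_ext(imm16: int) -> int:
    """Interpret a 16-bit field as a signed Python integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def load_word_steps(pc, regs, mem):
    """Run the five cycles of LW; one register transfer group per cycle."""
    # 1. Fetch Instruction: IR <- Mem[PC]; PC <- PC + 4
    ir = mem[pc]
    pc = (pc + 4) & MASK32
    # 2. Read Registers: A <- Registers[Rs]
    a = regs[(ir >> 21) & 31]
    # 3. Compute Address: ALUOut <- A + SignExt(Imm16)
    alu_out = (a + sign_ext(ir & 0xFFFF)) & MASK32
    # 4. Read Data: MDR <- Memory[ALUOut]
    mdr = mem[alu_out]
    # 5. Write Registers: Registers[Rt] <- MDR
    regs[(ir >> 16) & 31] = mdr
    return pc

lw = (0x23 << 26) | (1 << 21) | (2 << 16) | 8      # lw $2, 8($1)
mem = {0x000: lw, 0x108: 0xDEADBEEF}               # one memory for code + data
regs = [0] * 32
regs[1] = 0x100
pc = load_word_steps(0, regs, mem)
print(hex(regs[2]), pc)
```

Each intermediate value is held in its own register because the memory and ALU are reused in later cycles, so their outputs would otherwise be overwritten.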

Multi-cycle R-Type Instruction
1. Fetch Instruction InstructionRegister ← Mem[PC]; PC ← PC + 4

2. Read Registers A ← Registers[Rs]; B ← Registers[Rt]

3. Compute Value ALUOut ← A op B

4. Write Registers Registers[Rd] ← ALUOut

 RTL describes data flow action in each clock cycle

 Control signals determine precise data flow
 Each step implies unique control values


Multi-cycle R-Type Instruction:
Control Signal Values
1. Fetch Instruction InstructionRegister ← Mem[PC]; PC ← PC + 4
MemRead=1, ALUSrcA=0, IorD=0, IRWrite,
ALUSrcB=01, ALUop=00, PCWrite, PCSource=00

2. Read Registers A ← Registers[Rs]; B ← Registers[Rt]

ALUSrcA=0, ALUSrcB=11, ALUop=00

3. Compute Value ALUOut ← A op B

ALUSrcA=1, ALUSrcB=00, ALUop=10

4. Write Registers Registers[Rd] ← ALUOut

RegDst=1, RegWrite, MemtoReg=0

 Each step implies unique control values

 Fixed for entire cycle
 “Default value” implied if unspecified


Check Your Work - Is RTL Valid ?
1. Datapath check
   Within one cycle…
     Each cycle has valid data flow path (path exists)
     Each register gets only one new value
   Across multiple cycles…
     Register value is defined before use in previous (earlier in time) clock cycle
       Eg, “A ← 3” must occur before “B ← A”
     Make sure register value doesn’t disappear if set >1 cycle earlier

2. Control signal check
   Each cycle, RTL describing the datapath flow implies a value for each control signal
     0 or 1 or default or don’t care
   Each control signal gets only one fixed value the entire cycle

3. Overall check
   Does the sequence of steps work ?

Multi-cycle BEQ Instruction
1. Fetch Instruction
InstructionRegister ← Mem[PC]; PC ← PC + 4

2. Read Registers, Precompute Target

A ← Registers[Rs] ; B ← Registers[Rt] ; ALUOut ← PC + {SignExt{Imm16},b’00’}

3. Compare Registers, Conditional Branch

if( (A – B) ==0 ) PC ← ALUOut

Green shows PC calculation flow (in parallel with other operations)


Multi-cycle Datapath with Control Signals
[Figure: the multi-cycle datapath annotated with its control signals: PCSrc, PCWrite, IRWrite, IorD, MemRead, MemWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp. The jump address [31..0] is formed from Instr[25:0] and PC[31..28]; the ALU Control block is driven by Instruction[5:0] and ALUOp.]

Multi-cycle Datapath with Controller
[Figure: the same multi-cycle datapath with a controller block driven by Instr[31:26] generating the control signals.]

Multi-cycle CPU Control: Overview
[Figure: overview FSM bubble diagram; the states produce the control signal outputs.]

 General approach: Finite State Machine (FSM)

 Need details in each branch of control…
   Precise outputs for each state (Mealy depends on inputs, Moore does not)
   Precise “next state” for each state (can depend on inputs)
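The next-state part of such an FSM can be sketched as a function of the current state and the instruction class. The state names below (FETCH, DECODE, MEM_ADDR, and so on) are hypothetical labels, not the lecture's; the transitions are chosen so that the cycle counts match the multi-cycle goals stated earlier (LW 5, SW 4, R-type 4, BEQ and J 3).

```python
def next_state(state: str, op: str) -> str:
    """Next-state logic: depends on the current state and the instruction class."""
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        return {"LW": "MEM_ADDR", "SW": "MEM_ADDR", "R": "EXECUTE",
                "BEQ": "BRANCH", "J": "JUMP"}[op]
    if state == "MEM_ADDR":
        return "MEM_READ" if op == "LW" else "MEM_WRITE"
    if state == "MEM_READ":
        return "MEM_WB"
    if state == "EXECUTE":
        return "R_WB"
    # MEM_WB, MEM_WRITE, R_WB, BRANCH, JUMP: instruction done, fetch the next.
    return "FETCH"

def cycles(op: str) -> int:
    """Count the states visited from FETCH until the FSM returns to FETCH."""
    state, n = "FETCH", 0
    while True:
        n += 1
        state = next_state(state, op)
        if state == "FETCH":
            return n

for op in ("LW", "SW", "R", "BEQ", "J"):
    print(op, cycles(op))
```

In a Moore-style controller, each state name would additionally map to one fixed set of control signal values, such as the per-step values listed for the R-type instruction.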

How to Implement FSM ?
 Manually with logic gates + FFs
 Bubble diagram, next-state table, state assignment
 Karnaugh map for each state bit, each output bit (painful!)

 High-level language description (eg, Verilog, VHDL)

 Describe FSM bubble diagram (next-states, output values)
 Automatically synthesized into gates + FFs

 Microcode (µ-code) description

 Sequence through many µ-ops for each CPU instruction
 One µ-op (µ-instruction) sends correct control signal for 1 cycle
 µ-op similar to one bubble in FSM
 Acts like a mini-CPU within a CPU
 µPC: microcode program counter
 Microcode storage memory contains µ-ops
 Can look similar to RTL or some new “assembly language”


FSM Specification: Bubble Diagram
[Figure: FSM bubble diagram for the multi-cycle controller.]
 Can build this by examining RTL
 It is possible to automatically convert RTL into this form !

FSM: Gates + FFs Implementation
[Figure: FSM high-level organization.]

FSM: Microcode Implementation
[Figure: microcode storage (memory) whose outputs are the datapath control outputs and the sequencing control; a microprogram counter with an adder (+1); address select logic driven by the sequencing control and by inputs from the instruction.]

Multi-cycle CPU with Control FSM
[Figure: the multi-cycle datapath with the control FSM, driven by Instr[31:26], producing the control outputs, including the conditional branch logic. The jump address [31..0] is formed from Instr[25:0] and PC[31..28].]


Detailed FSM
[Figures: the complete FSM bubble diagram, followed by its sections in turn: Instruction Fetch, Memory Reference (LW, SW), R-Type Instruction, Branch Instruction, Jump Instruction.]

Performance Comparison

Single-cycle CPU
vs
Multi-cycle CPU


Simple Comparison

CPU                Instructions   Clock cycles
Single-cycle CPU   All            1
Multi-cycle CPU    LW             5
Multi-cycle CPU    SW, R-type     4
Multi-cycle CPU    BEQ, J         3

What’s really happening?
[Figure: timing diagram for a Load Word instruction (Fetch, Decode, Calc Addr, Memory, Write), comparing the ideal case on the single-cycle CPU with the multi-cycle CPU.]

In practice, steps differ in speeds…
Load Word Instruction
[Figure: timing diagram of the steps Fetch, Decode, Calc Addr, Memory, Write on the single-cycle and multi-cycle CPUs when the steps take different amounts of time. Annotations: “Wasted time!” and “Violation!”.]

Single-cycle vs Multi-cycle
LW instruction faster for single-cycle
[Figure: timing diagram of LW (Fetch, Decode, Calc Addr, Memory, Write) on both CPUs. Annotations: “Now wasted time is larger!” and “Violation fixed!”.]

Single-cycle vs Multi-cycle
SW instruction ~ same speed
[Figure: timing diagram of SW (Fetch, Decode, Calc Addr, Memory) on both CPUs, marking the speed difference and the wasted time.]

Single-cycle vs Multi-cycle
BEQ, J instruction faster for multi-cycle
[Figure: timing diagram of BEQ and J (Fetch, Decode, Calc Addr) on both CPUs, marking the speed difference and the wasted time.]

Performance Summary
 Which CPU implementation is faster?
   LW → single-cycle is faster
   SW, R-type → about the same
   BEQ, J → multi-cycle is faster

 Real programs use a mix of these instructions

 Overall performance depends on instruction frequency !
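The dependence on instruction frequency can be shown with a weighted average. The cycle counts below come from the lecture; the instruction mix is a hypothetical example, and the 5x clock ratio is the idealized design goal stated earlier, so the result is an upper bound rather than a measured figure.

```python
# Cycles per instruction class on the multi-cycle CPU (from the lecture).
CYCLES = {"LW": 5, "SW": 4, "R-type": 4, "BEQ": 3, "J": 3}

# Hypothetical instruction mix (fractions of executed instructions).
MIX = {"LW": 0.25, "SW": 0.10, "R-type": 0.45, "BEQ": 0.15, "J": 0.05}

# Average cycles per instruction, weighted by how often each class runs.
cpi = sum(MIX[name] * CYCLES[name] for name in MIX)

# Idealized assumption: the multi-cycle clock runs 5x faster than single-cycle.
CLOCK_RATIO = 5
relative_time = cpi / CLOCK_RATIO      # multi-cycle time / single-cycle time

print(f"average CPI = {cpi:.2f}")
print(f"multi-cycle time relative to single-cycle = {relative_time:.2f}")
```

A mix heavier in loads pushes the average CPI toward 5 and erases the advantage, while a mix heavier in branches and jumps improves it.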

Implementation Summary
 Single-cycle CPU
   1 instruction per cycle (eg, 1MHz → 1 MIPS)
   No “wasted time” on most complex instruction
   Large wasted time on simpler instructions
   Simple controller (just a lookup table or memory)
   Simple instructions

 Multi-cycle CPU
   << 1 instruction per cycle (eg, 1MHz → 0.2 MIPS)
   Small time wasted on most complex instruction
     Hence, this instruction always slower than single-cycle CPU
   Small time wasted on simple instructions
     Eliminates “large wasted time” by using fewer clock cycles
   Complex controller (FSM)
   Potential to create complex instructions
