Chapter 4
The Processor
Chapter 4 — The Processor — 1
Hazards
• Situations that prevent starting the next instruction in the
next cycle
• Structure hazards
A required resource is busy
• Data hazard
Need to wait for previous instruction to complete its data read/write
• Control hazard
Deciding on control action depends on previous instruction
Structure Hazards
• Conflict for use of a resource
• In MIPS pipeline with a single memory
Load/store requires data access
Instruction fetch would have to stall for that cycle
Would cause a pipeline “bubble”
• Hence, pipelined datapaths require separate instruction/data
memories
Or separate instruction/data caches
Laundry room example: a structural hazard would occur
• if we used a washer-dryer combination instead of a separate washer and dryer
• if our roommate were busy doing something else and wouldn’t put clothes away
Shading Used in the Diagrams
Data Hazards
• Occurs when the pipeline must be stalled because one step
must wait for another to complete
• Data hazards arise from the dependence of one instruction on
an earlier one that is still in the pipeline
add $s0, $t0, $t1
sub $t2, $s0, $t3
Without intervention, a data hazard could severely stall the pipeline: the
add instruction doesn’t write its result until the fifth stage, meaning that
we would have to waste three clock cycles in the pipeline
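The three wasted cycles can be checked from the stage positions alone. The sketch below is illustrative only: the function name and the `same_cycle_write_read` flag are hypothetical, and the default assumes the register file cannot supply a value in the same cycle it is written, which is the assumption that yields three bubbles.

```python
# Stage order of the classic five-stage pipeline.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stalls_without_forwarding(distance=1, same_cycle_write_read=False):
    """Bubbles needed between a producer and a consumer `distance` instructions later.

    The producer is fetched in cycle 1 and writes its register in WB.
    The consumer reads its registers in ID. Without forwarding, the read
    must not happen before the write has taken effect.
    """
    write_cycle = STAGES.index("WB") + 1             # producer's WB cycle (5)
    read_cycle = distance + STAGES.index("ID") + 1   # consumer's ID cycle with no stall
    # Assumption: unless the register file writes and reads in the same
    # cycle, the earliest usable read is one cycle after the write.
    earliest_read = write_cycle if same_cycle_write_read else write_cycle + 1
    return max(0, earliest_read - read_cycle)

# add $s0, ... followed immediately by sub ..., $s0, ...
print(stalls_without_forwarding(distance=1))
```

Setting `same_cycle_write_read=True` models a register file that writes in the first half of a cycle and reads in the second half, which removes one of the bubbles.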
Forwarding (aka Bypassing)
• Use result in ALU when it is computed
Don’t wait for it to be stored in a register
Requires extra connections in the datapath
• Forwarding paths are valid only if the destination stage is later in time than the source stage
There cannot be a valid forwarding path from the output of the memory access stage in the first instruction to the input of the execution stage of the following instruction
• The forwarded value replaces the value from register $s0 read in the second stage of sub
Load-Use Data Hazard
• Can’t always avoid stalls by forwarding
If value not computed when needed
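Why forwarding removes the stall for an ALU result but not for a load can be sketched by comparing when the value exists with when the consumer needs it. This is a simplified timing model, not the hardware hazard-detection logic; the function name and cycle numbering (cycle 0 is the producer's IF) are assumptions made for illustration.

```python
def stalls_with_forwarding(producer_is_load, distance=1):
    """Bubbles still needed when results are forwarded to the consumer's EX stage."""
    # Cycle in which the value becomes available:
    # end of MEM (cycle 3) for a load, end of EX (cycle 2) for an ALU operation.
    ready_in = 3 if producer_is_load else 2
    # Cycle in which the consumer would enter EX with no stall.
    needed_in = distance + 2
    # The value must be produced in a cycle strictly before the consumer's EX.
    return max(0, ready_in + 1 - needed_in)

print(stalls_with_forwarding(producer_is_load=False))  # ALU result used by next instruction
print(stalls_with_forwarding(producer_is_load=True))   # load result used by next instruction
```

In this model, a load followed immediately by a use still needs one bubble, while placing one independent instruction between them (`distance=2`) removes it.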
Code Scheduling to Avoid Stalls
• Reorder code to avoid use of load result in the next instruction
• C code for A = B + E; C = B + F;
        Original order             Reordered
        lw  $t1, 0($t0)            lw  $t1, 0($t0)
        lw  $t2, 4($t0)            lw  $t2, 4($t0)
stall   add $t3, $t1, $t2          lw  $t4, 8($t0)
        sw  $t3, 12($t0)           add $t3, $t1, $t2
        lw  $t4, 8($t0)            sw  $t3, 12($t0)
stall   add $t5, $t1, $t4          add $t5, $t1, $t4
        sw  $t5, 16($t0)           sw  $t5, 16($t0)
        13 cycles                  11 cycles
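The two cycle counts above can be reproduced with a small counter. The sketch assumes a five-stage pipeline with forwarding, so the only stall is a load whose result is used by the very next instruction; the helper names `parse` and `count_cycles` are hypothetical, and the parser handles only the `lw`, `sw`, and `add` forms used on this slide.

```python
import re

def parse(line):
    """Return (opcode, destination register or None, list of source registers)."""
    op, rest = line.split(None, 1)
    regs = re.findall(r"\$\w+", rest)
    if op == "sw":                 # sw reads both of its registers and writes none
        return op, None, regs
    return op, regs[0], regs[1:]   # lw/add: the first register is the destination

def count_cycles(program, stages=5):
    """Total cycles = instructions + pipeline fill + load-use stalls."""
    stalls = 0
    prev_op, prev_dest = None, None
    for line in program:
        op, dest, srcs = parse(line)
        # Load-use hazard: the previous instruction is a load and this one reads its result.
        if prev_op == "lw" and prev_dest in srcs:
            stalls += 1
        prev_op, prev_dest = op, dest
    return len(program) + (stages - 1) + stalls

original = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2", "sw $t3, 12($t0)",
    "lw $t4, 8($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
reordered = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)", "add $t3, $t1, $t2",
    "sw $t3, 12($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
print(count_cycles(original), count_cycles(reordered))  # 13 11
```

Seven instructions need 7 + 4 = 11 cycles to drain a five-stage pipeline; each load-use stall adds one more.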
Control Hazards
• Arises from the need to make a decision based on the results
of one instruction while others are executing
• Branch determines flow of control
Fetching next instruction depends on branch outcome
Pipeline can’t always fetch correct instruction
Still working on ID stage of branch
• In MIPS pipeline
Need to compare registers and compute target early in the
pipeline
Add hardware to do it in ID stage
Stall on Branch
• Wait until branch outcome determined before fetching next
instruction
Branch taken: OR
Branch Prediction
• Longer pipelines can’t readily determine branch outcome early
Stall penalty becomes unacceptable
• Predict outcome of branch
Only stall if prediction is wrong
• In MIPS pipeline
Can predict branches not taken
Fetch instruction after branch, with no delay
MIPS with Predict Not Taken
[Figure: pipeline diagrams for the two cases, prediction correct and prediction incorrect]
More-Realistic Branch Prediction
• Static branch prediction
Based on typical branch behavior
Example: loop and if-statement branches
Predict backward branches taken
Predict forward branches not taken
• Dynamic branch prediction (> 90% accuracy)
Hardware measures actual branch behavior
e.g., record recent history of each branch
Assume future behavior will continue the trend
When wrong, stall while re-fetching, and update history
Again long pipelines suffer
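One common way to "record recent history of each branch" is a 2-bit saturating counter per branch. The slides do not name a specific scheme, so the sketch below is an assumption chosen for illustration; the class name, the starting counter value, and the example branch address are all hypothetical.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counters: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counters = {}  # branch address -> counter in 0..3

    def predict(self, pc):
        # Unseen branches start at 1 (weakly not taken); this default is an assumption.
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(3, c + 1) if taken else max(0, c - 1)

def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly, updating history after each branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)

# A loop branch: taken nine times, then falls through, repeated ten times.
loop_branch = ([True] * 9 + [False]) * 10
print(accuracy(TwoBitPredictor(), 0x400100, loop_branch))
```

Because the counter needs two consecutive wrong outcomes to flip its prediction, a single loop exit costs one misprediction rather than two.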
Another approach
Pipeline Summary
• Pipelining improves performance by increasing instruction
throughput
Executes multiple instructions in parallel
Each instruction has the same latency
• Subject to hazards
Structure, data, control
• Instruction set design affects complexity of pipeline
implementation
§4.6 Pipelined Datapath and Control
reverse data movements influence only later instructions in the pipeline
MIPS Pipelined Datapath
• Right-to-left flow leads to hazards
MEM stage (selection of the next PC): control hazards
WB stage (write back to the register file): data hazards
Pipeline registers
• Add registers to hold data so that portions of a single datapath
can be shared during instruction execution
Holds information produced in previous cycle
IF for Load, Store, …
• PC + 4 is written back to the PC and also saved in the IF/ID pipeline register
• The incremented address is saved in case a later instruction stage needs it (the type of
instruction being fetched is not yet known)
ID for Load, Store, …
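• The IF/ID pipeline register supplies the 16-bit immediate field (sign-extended to 32 bits)
and the register numbers used to read the two registers
• The sign-extended immediate, the two register values, and the incremented PC address are stored in the ID/EX pipeline register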
EX for Load
• ALU: register 1 + sign-extended immediate, both from the ID/EX pipeline register
• ALU result is placed in the EX/MEM pipeline register
MEM for Load
• read the data memory (addr in EX/MEM pipeline register)
• load the data into the MEM/WB pipeline register
we need to preserve the destination register
number in the load instruction
WB for Load
Wrong register number: the write register number is taken from the IF/ID pipeline register, which by now holds a later instruction
• read the data from the MEM/WB pipeline register
• write it into the register file
Corrected Datapath for Load
write register number → ID/EX register → EX/MEM register → MEM/WB register
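The chain above can be sketched by modeling each pipeline register as a dictionary and copying the write register number forward one stage at a time. This is a minimal illustration, not the actual datapath: the function name, field names, and the example values are hypothetical.

```python
def run_load(write_reg, base_value, offset, memory):
    """Trace one lw through the pipeline registers, carrying its write register number to WB."""
    # ID: register value and sign-extended offset are latched,
    # together with the destination register number.
    id_ex = {"read_data1": base_value, "imm": offset, "write_reg": write_reg}
    # EX: the ALU adds base and offset; write_reg is copied along unchanged.
    ex_mem = {"alu_result": id_ex["read_data1"] + id_ex["imm"],
              "write_reg": id_ex["write_reg"]}
    # MEM: data memory is read at the address held in EX/MEM.
    mem_wb = {"read_data": memory[ex_mem["alu_result"]],
              "write_reg": ex_mem["write_reg"]}
    # WB: both the data and the register number come from MEM/WB,
    # so later instructions in IF/ID cannot disturb them.
    return mem_wb["write_reg"], mem_wb["read_data"]

# Hypothetical example: lw into register 9 from address 100 + 8.
print(run_load(write_reg=9, base_value=100, offset=8, memory={108: 42}))  # (9, 42)
```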
EX for Store
MEM for Store
WB for Store
Multi-Cycle Pipeline Diagram
• Form showing resource usage
Multi-Cycle Pipeline Diagram
• Traditional form
Single-Cycle Pipeline Diagram
• State of pipeline in a given cycle
Pipelined Control (Simplified)
Pipelined Control
• Control signals derived from instruction
• Control signals are first used in the EX stage (3 control lines are consumed in EX; the rest
are passed along in the pipeline registers to MEM and WB)
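How the control lines travel with an instruction can be sketched as follows. The signal names and values for `lw` and R-type follow the standard textbook MIPS design and are not listed in the slide text, so treat them as an assumption; the table and function names are hypothetical.

```python
# Control values grouped by the stage that consumes them (assumed textbook MIPS encoding).
CONTROL = {
    "R-type": {
        "EX":  {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB":  {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX":  {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB":  {"RegWrite": 1, "MemtoReg": 1},
    },
}

def pipeline_control(kind):
    """Return the control groups held in ID/EX, EX/MEM, and MEM/WB for one instruction."""
    ctrl = CONTROL[kind]
    # All signals are generated in ID and latched into ID/EX.
    id_ex = {"EX": ctrl["EX"], "MEM": ctrl["MEM"], "WB": ctrl["WB"]}
    # The EX group is consumed in EX; only MEM and WB groups move on.
    ex_mem = {"MEM": id_ex["MEM"], "WB": id_ex["WB"]}
    # The MEM group is consumed in MEM; only the WB group reaches the last stage.
    mem_wb = {"WB": ex_mem["WB"]}
    return id_ex, ex_mem, mem_wb

id_ex, ex_mem, mem_wb = pipeline_control("lw")
print(sorted(id_ex), sorted(ex_mem), sorted(mem_wb))
```

Each pipeline register holds one group fewer than the one before it, which is why the control fields shrink from left to right in the datapath figure.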
Pipelined Control