• To improve the performance of a CPU, we have two options:
• Improve the hardware by introducing faster circuits.
• Arrange the hardware such that more than one operation can be
performed at the same time.
• Since there is a limit on the speed of hardware and the cost of faster
circuits is quite high, we must adopt the second option.
Pipelining
• Instruction pipelining:
• Is a technique for implementing instruction-level parallelism
within a single processor. Pipelining attempts to keep every part of
the processor busy with some instruction by dividing each incoming
instruction into a series of sequential steps.
• Is an arrangement of the hardware elements of the CPU
such that its overall performance is increased. Simultaneous
execution of more than one instruction takes place in a pipelined
processor.
1. Non-Pipelined Execution-
If the time taken for executing one instruction = t, then the time taken for
executing 'n' instructions = n x t
2. Pipelined Execution-
• In pipelined architecture, multiple instructions are executed in parallel.
• This style of executing the instructions is highly efficient.
• A pipelined processor does not wait until the previous
instruction has executed completely. Rather, it fetches
the next instruction and begins its execution while the
previous one is still in progress.
Phase-Time Diagram-
• A phase-time diagram shows the execution of instructions in the
pipelined architecture.
• The following diagram shows the execution of three instructions in
a four-stage pipelined architecture.
Time taken to execute three instructions in
a four-stage pipelined architecture = 6 clock
cycles.
NOTE-
• In non-pipelined architecture, time taken to execute three instructions
would be
= No. of instructions x Time taken to execute one instruction
= 3 x 4 clock cycles
= 12 clock cycles
• Clearly, pipelined execution (6 clock cycles) is more efficient than
non-pipelined execution (12 clock cycles).
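The phase-time diagram described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an ideal pipeline in which every stage takes exactly one clock cycle and there are no stalls; the function name `phase_time_diagram` and the labels `I1`, `S1`, `C1` are hypothetical and chosen only for this example.

```python
def phase_time_diagram(num_instructions, num_stages):
    """Return the rows of a phase-time diagram for an ideal pipeline.

    Row s lists, for every clock cycle, which instruction occupies
    stage s ("--" means the stage is idle in that cycle).
    Assumption: one clock cycle per stage, no stalls.
    """
    total_cycles = num_stages + num_instructions - 1
    rows = []
    for stage in range(num_stages):
        row = []
        for cycle in range(total_cycles):
            # Instruction i (0-based) reaches stage s in cycle i + s.
            instr = cycle - stage
            if 0 <= instr < num_instructions:
                row.append(f"I{instr + 1}")
            else:
                row.append("--")
        rows.append(row)
    return rows


rows = phase_time_diagram(num_instructions=3, num_stages=4)

# Print a header of clock cycles, then one line per stage.
print("     " + " ".join(f"C{c + 1}" for c in range(len(rows[0]))))
for stage, row in enumerate(rows, start=1):
    print(f"S{stage}:  " + " ".join(row))

pipelined_cycles = len(rows[0])   # 4 + 3 - 1 = 6 clock cycles
non_pipelined_cycles = 3 * 4      # 12 clock cycles
print(pipelined_cycles, non_pipelined_cycles)
```

Each instruction moves one stage to the right per clock cycle, so the last of the three instructions leaves the fourth stage in cycle 6, against 12 cycles when the instructions run one after another.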
Performance of Pipelined Execution-
• The following parameters serve as criteria to estimate the performance of
pipelined execution-
1. Speed Up-
• It gives an idea of "how much faster" the pipelined execution is
as compared to non-pipelined execution. It is calculated as-
Speed up = Non-pipelined execution time / Pipelined execution time
2. Efficiency-
• The efficiency of pipelined execution is
calculated as-
Efficiency = Speed up / Number of stages in the pipeline
3. Throughput-
• Throughput is defined as the number of
instructions executed per unit time. It is
calculated as-
Throughput = Number of instructions executed / Total time taken
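The three parameters above can be expressed as small Python helpers. This is only a sketch: the function names are hypothetical, and efficiency is taken here as speed up divided by the number of stages, which is a common textbook definition assumed for this example. The sample values reuse the three-instruction, four-stage case (12 cycles non-pipelined, 6 cycles pipelined).

```python
def speed_up(non_pipelined_time, pipelined_time):
    """How many times faster the pipelined execution is."""
    return non_pipelined_time / pipelined_time


def efficiency(speed_up_value, num_stages):
    """Assumed definition: speed up divided by the number of stages."""
    return speed_up_value / num_stages


def throughput(num_instructions, total_time):
    """Instructions completed per unit of time."""
    return num_instructions / total_time


# Three instructions on a four-stage pipeline, times in clock cycles.
s = speed_up(non_pipelined_time=12, pipelined_time=6)
e = efficiency(s, num_stages=4)
t = throughput(num_instructions=3, total_time=6)
print(s, e, t)
```

With these inputs the speed up is 2, the efficiency is 0.5 (half of the ideal four-fold gain), and the throughput is 0.5 instructions per clock cycle.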
Calculation of Important Parameters-
• Let us calculate certain important parameters of pipelined architecture.
Consider-
A pipelined architecture consisting of a k-stage pipeline, with total number of
instructions to be executed = n
• Point-01: Calculating Cycle Time-
• In pipelined architecture,
• There is a global clock that synchronizes the working of all the stages.
• Frequency of the clock is set such that all the stages are synchronized.
• At the beginning of each clock cycle, each stage reads the data from its
register and processes it. Cycle time is the duration of one clock cycle.
There are two cases possible-
Case-01: All the stages offer same delay-
If all the stages offer the same delay, then-
Cycle time = Delay offered by one stage
including the delay due to its register
Case-02: All the stages do not offer same delay-
If all the stages do not offer the same delay, then-
Cycle time = Maximum delay offered by any
stage including the delay due to its register
Point-02: Calculating Frequency Of Clock-
Frequency of the clock (f) = 1 / Cycle time
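Both cases of the cycle-time rule, together with the clock frequency, can be checked with a short Python sketch. The function names and the delay values below are hypothetical, chosen only to illustrate the two cases; all delays are in nanoseconds, and the same register delay is assumed for every stage.

```python
def cycle_time(stage_delays, register_delay):
    """Cycle time = largest stage delay plus the register (latch) delay.

    When all stages have the same delay, max() simply returns that delay,
    so this one expression covers both Case-01 and Case-02.
    """
    return max(stage_delays) + register_delay


def clock_frequency(cycle_time_value):
    """Frequency = 1 / cycle time (per nanosecond here, i.e. GHz)."""
    return 1 / cycle_time_value


# Case-01: every stage offers the same delay (hypothetical values).
equal_case = cycle_time([20, 20, 20], register_delay=5)

# Case-02: stages offer different delays; the slowest stage decides.
unequal_case = cycle_time([20, 35, 25], register_delay=5)

freq_ghz = clock_frequency(unequal_case)
print(equal_case, unequal_case, freq_ghz)
```

In the second case the 35 ns stage sets the pace, so the cycle time is 40 ns and the clock runs at 0.025 GHz (25 MHz), even though the other stages could go faster.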
Calculating Pipelined Execution Time-
In pipelined architecture,
• Assume that we have n instructions.
• Multiple instructions execute in parallel.
• Number of clock cycles taken by the first instruction = k clock cycles
• After the first instruction has completely executed, one instruction
comes out per clock cycle.
• Thus,
• Pipelined execution time
• = Time taken to execute first instruction + Time taken to execute
remaining instructions
• = 1 x k clock cycles + (n – 1) x 1 clock cycle
• = (k + n – 1) clock cycles
• = (k + n – 1) x Tp, where Tp is the clock cycle time
Calculating Speed Up-
• Speed up
• = Non-pipelined execution time / Pipelined execution time
• = n x k clock cycles / (k + n – 1) clock cycles
• = n x k / (k + n – 1)
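The two formulas above can be tried out numerically. The following Python sketch uses hypothetical function names and assumes, as the derivation does, that a non-pipelined instruction takes k clock cycles and that the pipeline never stalls.

```python
def pipelined_cycles(n, k):
    """Clock cycles for n instructions on a k-stage pipeline: k + n - 1."""
    return k + n - 1


def non_pipelined_cycles(n, k):
    """Clock cycles when each instruction runs alone for k cycles."""
    return n * k


def speed_up_ratio(n, k):
    """Speed up = n x k / (k + n - 1)."""
    return non_pipelined_cycles(n, k) / pipelined_cycles(n, k)


# The speed up grows with n but never reaches the number of stages k.
for n in (3, 100, 10_000):
    print(n, pipelined_cycles(n, 4), round(speed_up_ratio(n, 4), 3))
```

For three instructions on four stages the speed up is 12 / 6 = 2. As n becomes large, the k – 1 start-up cycles matter less and less, so the ratio approaches k, the number of stages, without reaching it.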
PRACTICE PROBLEMS BASED ON PIPELINING IN COMPUTER ARCHITECTURE-
• Problem-01:
• Consider a pipeline having 4 phases (stages) with
durations 60, 50, 90 and 80 ns. Given latch delay is
10 ns. Calculate-
1. Pipeline cycle time
2. Non-pipeline execution time
3. Speed up ratio
4. Pipeline time for 1000 tasks
5. Sequential time for 1000 tasks
6. Throughput
• Given:
• Four stage pipeline is used
• Delay of stages = 60, 50, 90 and 80 ns
• Latch delay or delay due to each register = 10 ns
Solution-
• Pipeline Cycle Time-
• Cycle time = Maximum delay due to any stage + Delay
due to its register
• = Max { 60, 50, 90, 80 } ns + 10 ns
• = 90 ns + 10 ns
• = 100 ns
• Non-Pipeline Execution Time-
• Non-pipeline execution time for one instruction= 60 ns +
50 ns + 90 ns + 80 ns = 280 ns
• Speed Up Ratio-
• Speed up = Non-pipeline execution time / Pipeline
execution time
• = 280 ns / Cycle time (once the pipeline is full, one task completes per cycle)
• = 280 ns / 100 ns
• = 2.8
Pipeline Time For 1000 Tasks-
• Pipeline time for 1000 tasks
• = Time taken for 1st task + Time taken for remaining 999
tasks
• = 1 x 4 clock cycles + 999 x 1 clock cycle
• = 4 x cycle time + 999 x cycle time
• = 4 x 100 ns + 999 x 100 ns
• = 400 ns + 99900 ns
• = 100300 ns
• Equivalently, (n + k – 1) x Tp = (1000 + 4 – 1) x 100 ns = 100300 ns
Sequential Time For 1000 Tasks-
Non-pipeline time for 1000 tasks
= n x Time taken for one task
= 1000 x Time taken for one task
= 1000 x 280 ns
= 280000 ns
Throughput-
Throughput for pipelined execution
= Number of instructions executed per unit
time
= 1000 tasks / 100300 ns
= 0.00997 tasks per ns (approximately)
= 9.97 x 10^6 tasks per second (approximately)
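The whole of Problem-01 can be re-checked with a few lines of Python. This is a sketch that simply replays the arithmetic of the solution; the variable names are hypothetical, and all times are in nanoseconds.

```python
# Given data from Problem-01.
stage_delays_ns = [60, 50, 90, 80]
latch_delay_ns = 10
num_tasks = 1000

k = len(stage_delays_ns)  # number of stages = 4

# 1. Pipeline cycle time: slowest stage plus latch delay.
cycle_time_ns = max(stage_delays_ns) + latch_delay_ns

# 2. Non-pipeline execution time for one task: sum of stage delays.
non_pipelined_task_ns = sum(stage_delays_ns)

# 3. Speed up ratio, using one task per cycle in steady state.
speed_up_ratio = non_pipelined_task_ns / cycle_time_ns

# 4. Pipeline time for 1000 tasks: (n + k - 1) x cycle time.
pipeline_total_ns = (num_tasks + k - 1) * cycle_time_ns

# 5. Sequential time for 1000 tasks.
sequential_total_ns = num_tasks * non_pipelined_task_ns

# 6. Throughput, in tasks per ns and in tasks per second.
throughput_per_ns = num_tasks / pipeline_total_ns
throughput_per_s = throughput_per_ns * 1e9

print(cycle_time_ns, non_pipelined_task_ns, speed_up_ratio)
print(pipeline_total_ns, sequential_total_ns)
print(throughput_per_ns, throughput_per_s)
```

Note that the speed up of 2.8 is the steady-state figure. Dividing the two totals for 1000 tasks (280000 ns / 100300 ns) gives a slightly smaller value, about 2.79, because of the start-up cycles needed to fill the pipeline.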