PIPELINING AND VECTOR PROCESSING
• Parallel Processing
• Pipelining
• Arithmetic Pipeline
• Instruction Pipeline
• RISC Pipeline
PARALLEL PROCESSING
Parallel processing is a term used for a large class of techniques that provide simultaneous data-processing tasks for the purpose of increasing the computational speed of a computer system. Instead of processing each instruction sequentially as in a conventional computer, a parallel processing system performs concurrent data processing to achieve faster execution.
• Ex: while an instruction is being executed in the ALU, the
next instruction can be read from memory.
• The system may have two or more ALUs and be able to
execute two or more instructions at the same time.
• Purpose: To increase the throughput
• Throughput: The amount of processing that can be
accomplished during a given interval of time.
• The amount of hardware increases with parallel processing, and with it, the cost of the system increases.
• However, technological developments have reduced hardware costs to the point where parallel processing techniques are economically feasible.
• Example of parallel processing:
  – Multiple functional units: the execution unit is separated into several functional units operating in parallel.
• One possible way is to separate the execution unit into eight functional units operating in parallel.
• Arithmetic operations with integers are handled by units such as the adder-subtractor and the integer multiplier.
• All units are independent of each other, so different operations can be performed at the same time.
• Parallel processing can be classified from:
  (i) the internal organization of the processors,
  (ii) the interconnection structure between processors, and
  (iii) the flow of information through the system.
• M. J. Flynn classified the organization of a computer system by the number of instruction and data items that are manipulated simultaneously.
• The sequence of instructions read from memory constitutes an instruction stream. The operations performed on the data in the processor constitute a data stream.
• Flynn's classification divides computers into four major groups.
PARALLEL COMPUTERS
Architectural Classification
– Flynn's classification
» Based on the multiplicity of Instruction Streams and Data Streams
» Instruction Stream
• Sequence of Instructions read from memory
» Data Stream
• Operations performed on the data in the processor
                               Number of Data Streams
                               Single        Multiple
  Number of        Single      SISD          SIMD
  Instruction
  Streams          Multiple    MISD          MIMD
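As an illustration (not part of the original slides), the classification can be expressed as a simple lookup on the two stream counts; the function name and example counts below are hypothetical:

# Flynn's classification as a lookup on the multiplicity of
# instruction streams and data streams (illustrative sketch).
FLYNN = {
    ("single",   "single"):   "SISD",
    ("single",   "multiple"): "SIMD",
    ("multiple", "single"):   "MISD",
    ("multiple", "multiple"): "MIMD",
}

def classify(instruction_streams, data_streams):
    key = ("single" if instruction_streams == 1 else "multiple",
           "single" if data_streams == 1 else "multiple")
    return FLYNN[key]

print(classify(1, 1))    # SISD: conventional uniprocessor
print(classify(1, 64))   # SIMD: array processor
print(classify(16, 16))  # MIMD: multiprocessor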
SISD COMPUTER SYSTEMS
Figure: SISD organization. A control unit sends an instruction stream to a processor unit, which exchanges a data stream with a memory unit.
• Characteristics:
Ø One control unit, one processor unit, and one memory unit
Ø Parallel processing may be achieved by means of:
ü multiple functional units
ü pipeline processing
MISD COMPUTER SYSTEMS
Figure: MISD organization. Several control units (CU), each issuing its own instruction stream to a processor unit (P), while all processors operate on a single data stream from a shared memory.
• Characteristics:
  - There is no computer at present that can be classified as MISD.
  - It is of only theoretical interest, since no practical system has been constructed using this organization.
SIMD COMPUTER SYSTEMS
Figure: SIMD organization. A control unit, connected to a memory by a data bus, broadcasts a single instruction stream to an array of processor units (P); the processors exchange data streams with memory modules (M) through an alignment network.
• Characteristics
Ø Only one copy of the program exists
Ø All processors receive the same instruction from the control
unit but operate on different items of data.
MIMD COMPUTER SYSTEMS
Figure: MIMD organization. Processor/memory pairs (P, M) connected through an interconnection network to a shared memory.
• Characteristics:
Ø Multiple processing units (multiprocessor system)
Ø Execution of multiple instructions on multiple data
• Types of MIMD computer systems
- Shared memory multiprocessors
- Message-passing multicomputer (multicomputer system)
PIPELINING
• A technique of decomposing a sequential process into suboperations,
with each subprocess being executed in a special dedicated segment
that operates concurrently with all other segments.
    Ai * Bi + Ci    for i = 1, 2, 3, ..., 7

Figure: three-segment pipeline. Segment 1 loads Ai and Bi from memory into registers R1 and R2; Segment 2 multiplies R1 by R2 into R3 and loads Ci into R4; Segment 3 adds R3 and R4 into R5.

Suboperations in each segment:
    R1 ← Ai,  R2 ← Bi           Load Ai and Bi
    R3 ← R1 * R2,  R4 ← Ci      Multiply and load Ci
    R5 ← R3 + R4                Add
OPERATIONS IN EACH PIPELINE STAGE
Clock pulse    Segment 1        Segment 2               Segment 3
  number       R1      R2       R3          R4          R5
     1         A1      B1       ---         ---         ---
     2         A2      B2       A1 * B1     C1          ---
     3         A3      B3       A2 * B2     C2          A1 * B1 + C1
     4         A4      B4       A3 * B3     C3          A2 * B2 + C2
     5         A5      B5       A4 * B4     C4          A3 * B3 + C3
     6         A6      B6       A5 * B5     C5          A4 * B4 + C4
     7         A7      B7       A6 * B6     C6          A5 * B5 + C5
     8         ---     ---      A7 * B7     C7          A6 * B6 + C6
     9         ---     ---      ---         ---         A7 * B7 + C7
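The table can be reproduced with a short simulation. The following Python sketch (not from the original slides) models the three segments with symbolic operands; register updates are applied from the last segment backward so that each segment uses the values latched on the previous clock pulse:

# Simulate the three-segment pipeline for Ai * Bi + Ci, i = 1..n,
# printing the register contents after each clock pulse.
def simulate_pipeline(n=7):
    R1 = R2 = R3 = R4 = R5 = None            # empty registers
    for clock in range(1, n + 3):            # n + (k - 1) pulses, k = 3
        # Segment 3: add, using values latched by segment 2
        R5 = f"{R3} + {R4}" if R3 is not None else None
        # Segment 2: multiply and load Ci, using values latched by segment 1
        if R1 is not None:
            R3, R4 = f"{R1} * {R2}", f"C{clock - 1}"
        else:
            R3 = R4 = None
        # Segment 1: load the next Ai and Bi while earlier tasks move on
        if clock <= n:
            R1, R2 = f"A{clock}", f"B{clock}"
        else:
            R1 = R2 = None
        print(clock, R1 or "---", R2 or "---", R3 or "---", R4 or "---", R5 or "---")

simulate_pipeline()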
GENERAL PIPELINE
• General Structure of a 4-Segment Pipeline
Figure: Input → S1 → R1 → S2 → R2 → S3 → R3 → S4 → R4, with all registers driven by a common clock.
• Space-Time Diagram
The following diagram shows 6 tasks T1 through T6 executed in 4
segments.
Clock cycles    1    2    3    4    5    6    7    8    9
Segment 1      T1   T2   T3   T4   T5   T6
        2           T1   T2   T3   T4   T5   T6
        3                T1   T2   T3   T4   T5   T6
        4                     T1   T2   T3   T4   T5   T6

No matter how many segments, once the pipeline is full, it takes only one clock period to obtain an output.
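The space-time diagram generalizes to any k and n. A minimal Python sketch (an illustration, not part of the slides): task Ti occupies segment s during clock cycle i + s - 1.

# Print the space-time diagram of a k-segment pipeline executing n tasks.
def space_time(k=4, n=6):
    total = k + n - 1                        # clock cycles to finish all tasks
    for s in range(1, k + 1):
        row = []
        for c in range(1, total + 1):
            t = c - s + 1                    # task in segment s at cycle c
            row.append(f"T{t}" if 1 <= t <= n else "  ")
        print(f"Segment {s}: " + " ".join(row))

space_time()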
PIPELINE SPEEDUP
Consider the case where a k-segment pipeline is used to execute n tasks.
 Ø n = 6 in the previous example
 Ø k = 4 in the previous example
• Pipelined machine (k segments, n tasks)
 Ø Clock cycles to complete the n tasks = k + (n - 1)  (9 in the previous example)
• Conventional machine (non-pipelined)
 Ø Cycles to complete each task = k (assuming each task takes as long as one pass through all k segments)
 Ø For n tasks, nk cycles are required
• Speedup (S)
 Ø S = non-pipelined time / pipelined time
 Ø For n tasks: S = nk / (k + n - 1)
 Ø As n becomes much larger than k - 1, k + n - 1 approaches n; therefore, S approaches nk/n = k
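A quick numerical check of the speedup formula (illustrative Python, not from the slides):

# Pipeline speedup S = n*k / (k + n - 1); it approaches k as n grows.
def speedup(n, k):
    return (n * k) / (k + n - 1)

print(speedup(6, 4))       # previous example: 24 / 9   ≈ 2.67
print(speedup(100, 4))     # example below:    400 / 103 ≈ 3.88
print(speedup(10**6, 4))   # very large n: approaches k = 4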
PIPELINE AND MULTIPLE FUNCTION UNITS
Example:
  - 4-stage pipeline
  - 100 tasks to be executed in sequence
  - each task takes 4 clock cycles on the non-pipelined system
Pipelined system:      k + n - 1 = 4 + 99 = 103 clock cycles
Non-pipelined system:  n * k = 100 * 4 = 400 clock cycles
Speedup:               S = 400 / 103 ≈ 3.88
Types of Pipelining
• Arithmetic Pipeline
• Instruction Pipeline
Arithmetic Pipeline
• Pipeline arithmetic units are usually found in very high speed computers.
• They are used to implement floating-point operations (addition and subtraction), multiplication of fixed-point numbers, and similar computations.
• The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers.
• Floating-point addition and subtraction can be performed in four segments, as shown in the figure below.
• The registers labeled R are placed between the segments to store intermediate results.
• The suboperations performed in the four segments are:
  1. Compare the exponents
  2. Align the mantissas
  3. Add or subtract the mantissas
  4. Normalize the result
Floating-point adder pipeline (four segments):
  [1] Compare the exponents
  [2] Align the mantissas
  [3] Add or subtract the mantissas
  [4] Normalize the result

Figure: the exponents a, b and mantissas A, B enter Segment 1, which compares the exponents by subtraction; Segment 2 chooses the larger exponent and aligns the mantissa of the smaller number; Segment 3 adds or subtracts the mantissas; Segment 4 adjusts the exponent and normalizes the result. Registers R separate the segments.

Example:  X = A x 10^a = 0.9504 x 10^3
          Y = B x 10^b = 0.8200 x 10^2

1) Compare exponents:  3 - 2 = 1
2) Align mantissas:    X = 0.9504  x 10^3
                       Y = 0.08200 x 10^3
3) Add mantissas:      Z = 1.0324  x 10^3
4) Normalize result:   Z = 0.10324 x 10^4
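The four suboperations can be traced in a few lines of Python. This is an illustrative sketch using decimal (mantissa, exponent) pairs, not the actual hardware organization:

# Floating-point addition in four steps; a number (m, e) means m * 10**e.
def fp_add(x, y):
    (ma, ea), (mb, eb) = x, y
    # Segment 1: compare the exponents (by subtraction)
    diff = ea - eb
    # Segment 2: choose the larger exponent and align the smaller mantissa
    if diff >= 0:
        e, mb = ea, mb / (10 ** diff)
    else:
        e, ma = eb, ma / (10 ** -diff)
    # Segment 3: add (or subtract) the mantissas
    m = ma + mb
    # Segment 4: normalize so that 0.1 <= |mantissa| < 1
    while abs(m) >= 1:
        m, e = m / 10, e + 1
    while 0 < abs(m) < 0.1:
        m, e = m * 10, e - 1
    return m, e

X = (0.9504, 3)
Y = (0.8200, 2)
print(fp_add(X, Y))    # ≈ (0.10324, 4), i.e. 0.10324 x 10^4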
• The comparator, shifter, adder-subtractor, incrementer, and decrementer in the floating-point pipeline are implemented with combinational circuits.
• Suppose the individual segment delays are 60, 70, 100, and 80 ns (a total of 310 ns), and the register delay is 10 ns.
• Non-pipelined total delay = 310 + 10 = 320 ns.
• Pipelined clock period = longest segment delay + register delay = 100 + 10 = 110 ns.
• Speedup = 320 / 110 ≈ 2.9.
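The same arithmetic as an illustrative snippet (the delay values are the ones assumed above):

# Speedup of the pipelined adder from the segment and register delays.
segment_delays = [60, 70, 100, 80]    # ns, one entry per segment
register_delay = 10                   # ns

non_pipelined = sum(segment_delays) + register_delay     # 320 ns
pipelined_clock = max(segment_delays) + register_delay   # 110 ns
print(non_pipelined / pipelined_clock)                    # ≈ 2.9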
INSTRUCTION PIPELINE
Pipeline processing can occur not only in the data stream but in the
instruction stream as well.
An instruction pipeline reads consecutive instructions from memory
while previous instructions are being executed in other segments.
Six Phases* in an Instruction Cycle
[1] Fetch an instruction from memory
[2] Decode the instruction
[3] Calculate the effective address of the operand
[4] Fetch the operands from memory
[5] Execute the operation
[6] Store the result in the proper place
• * Some instructions skip some phases
• * Effective address calculation can be done as part of the decoding phase
• * Storage of the operation result into a register is done automatically in the execution phase
• ==> 4-Stage Pipeline
  [1] FI: Fetch an instruction from memory
  [2] DA: Decode the instruction and calculate the effective address of the operand
  [3] FO: Fetch the operand
  [4] EX: Execute the operation
Execution of Three Instructions in a 4-Stage Pipeline
Conventional (non-pipelined):
  i      FI  DA  FO  EX
  i+1                    FI  DA  FO  EX
  i+2                                    FI  DA  FO  EX

Pipelined:
  i      FI  DA  FO  EX
  i+1        FI  DA  FO  EX
  i+2            FI  DA  FO  EX
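The difference between the two timings can be expressed as a simple scheduling rule. The Python sketch below (an illustration, not from the slides) gives the clock cycle in which instruction i occupies each stage:

# Start cycle of each stage for instruction i (0-based) in conventional
# versus pipelined execution of a 4-stage FI-DA-FO-EX pipeline.
STAGES = ["FI", "DA", "FO", "EX"]

def conventional(i, stage):
    return i * len(STAGES) + stage      # each instruction waits for the previous one

def pipelined(i, stage):
    return i + stage                    # a new instruction enters every cycle

for i in range(3):
    print("conventional", i, [conventional(i, s) for s in range(4)])
    print("pipelined   ", i, [pipelined(i, s) for s in range(4)])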
INSTRUCTION EXECUTION IN A 4-STAGE PIPELINE
Figure: instruction flow through the 4-stage pipeline.
  Segment 1: fetch instruction from memory
  Segment 2: decode instruction and calculate effective address; if the instruction is a branch, update the PC and empty the pipe
  Segment 3: fetch operand from memory
  Segment 4: execute instruction; if an interrupt is pending, handle the interrupt, then update the PC and empty the pipe
Timing of the instruction pipeline when instruction 3 is a branch:

Step:           1   2   3   4   5   6   7   8   9   10  11  12  13
Instruction 1   FI  DA  FO  EX
            2       FI  DA  FO  EX
  (Branch)  3           FI  DA  FO  EX
            4               FI          FI  DA  FO  EX
            5                               FI  DA  FO  EX
            6                                   FI  DA  FO  EX
            7                                       FI  DA  FO  EX
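The branch penalty above can be reproduced with a small scheduler. This Python sketch is an illustration only; it assumes the target instruction cannot be fetched until the branch finishes its EX stage, and that the instruction fetched behind the branch is discarded:

# Schedule a 4-stage FI-DA-FO-EX pipeline where one instruction is a branch.
def schedule(n_instructions, branch_at):
    timeline = {}                     # instruction -> [FI, DA, FO, EX] clock steps
    fetch_step = 1
    for i in range(1, n_instructions + 1):
        steps = [fetch_step, fetch_step + 1, fetch_step + 2, fetch_step + 3]
        timeline[i] = steps
        if i == branch_at:
            # next useful fetch happens the cycle after the branch's EX stage
            fetch_step = steps[3] + 1
        else:
            fetch_step += 1
    return timeline

for instr, steps in schedule(7, branch_at=3).items():
    print(instr, dict(zip(["FI", "DA", "FO", "EX"], steps)))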
Pipeline Conflicts
– Pipeline conflicts: 3 major difficulties
  1) Resource conflicts: memory accesses by two segments at the same time. Most of these conflicts can be resolved by using separate instruction and data memories.
  2) Data dependency: an instruction depends on the result of a previous instruction, but this result is not yet available.
  3) Branch difficulties: branch and other instructions (interrupt, return, etc.) that change the value of the PC.
RISC Computer
• RISC (Reduced Instruction Set Computer)
- Machine with a very fast clock cycle that executes at the rate of one
instruction per cycle.
• Major Characteristics
1. Relatively few instructions
2. Relatively few addressing modes
3. Memory access limited to load and store instructions
4. All operations done within the registers of the CPU
5. Fixed-length, easily decoded instruction format
6. Single-cycle instruction execution
7. Hardwired rather than microprogrammed control
8. Relatively large number of registers in the processor unit
9. Efficient instruction pipeline
10. Compiler support for efficient translation of high-level language
programs into machine language programs
RISC PIPELINE
• The instruction cycle can be divided into three suboperations and implemented in three segments (I, A, E):
  I segment: fetches the instruction from program memory.
  A segment: decodes the instruction and performs an ALU operation.
  E segment: transfers the output of the ALU to a register, to memory, or to the PC.
• Types of instructions
- Data Manipulation Instructions
- Load and Store Instructions
- Program Control Instructions
• Example: three-segment instruction pipeline
  – Pipeline timing with data conflict (figure (a))
  – Pipeline timing with delayed load (figure (b))
• In figure (a), there is a data conflict in instruction 3 because the operand in R2 is not yet available in the A segment:
  1. LOAD:  R1 ← M[address 1]
  2. LOAD:  R2 ← M[address 2]
  3. ADD:   R3 ← R1 + R2
  4. STORE: M[address 3] ← R3
• Solution: delayed load. The compiler inserts a no-op (or a useful instruction that does not depend on the load) after the second LOAD, so that R2 is available by the time the ADD reaches the A segment.
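A compiler-style sketch of the delayed-load fix (illustrative Python, not from the slides; the program encoding below is hypothetical): it inserts a no-op after any LOAD whose destination register is used by the very next instruction.

# Each instruction is (opcode, destination, list_of_sources).
program = [
    ("LOAD",  "R1", ["M1"]),         # R1 <- M[address 1]
    ("LOAD",  "R2", ["M2"]),         # R2 <- M[address 2]
    ("ADD",   "R3", ["R1", "R2"]),   # R3 <- R1 + R2
    ("STORE", "M3", ["R3"]),         # M[address 3] <- R3
]

def delayed_load(prog):
    out = []
    for idx, instr in enumerate(prog):
        out.append(instr)
        op, dest, _ = instr
        nxt = prog[idx + 1] if idx + 1 < len(prog) else None
        # A LOAD result is not ready for the A segment of the very next
        # instruction, so a no-op delay slot is inserted after the LOAD.
        if op == "LOAD" and nxt is not None and dest in nxt[2]:
            out.append(("NOP", None, []))
    return out

for instr in delayed_load(program):
    print(instr)    # LOAD R1, LOAD R2, NOP, ADD R3, STORE M3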