Parallel Computing
Dr. Gargi Sanket Prabhu
Parallel Architectures
CS & IS, BITS Pilani K K Birla Goa Campus
Amdahl’s Law
• Let f be the fraction of operations in a computation that must be performed
sequentially, where 0 ≤ f ≤ 1
Maximum speedup:
S ≤ 1 / (f + (1 − f)/p)
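As an illustration (a minimal sketch, not part of the original slides), the bound can be evaluated in C to see how the speedup saturates as p grows:

#include <stdio.h>

/* Upper bound on speedup from Amdahl's Law: S <= 1 / (f + (1 - f) / p) */
double amdahl_bound(double f, int p)
{
    return 1.0 / (f + (1.0 - f) / p);
}

int main(void)
{
    /* f = 0.2 (20% inherently sequential), p = 1, 2, 4, ... processors */
    for (int p = 1; p <= 64; p *= 2)
        printf("p = %2d  S <= %.2f\n", p, amdahl_bound(0.2, p));
    return 0;
}

Even with 64 processors the bound stays below 1/f = 5, which previews the limitations discussed later.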
Example
• If 80% of a program can be parallelized, then the theoretical maximum
speedup when the number of processors is 5 is ___________
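Worked check with the formula above: 80% parallelizable means f = 0.2, and with p = 5 the bound is S ≤ 1 / (0.2 + 0.8/5) = 1 / 0.36 ≈ 2.78.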
Amdahl’s Effect
• As the problem size increases, the fraction f of inherently sequential
operations decreases, making the problem more amenable to
parallelization.
Amdahl’s Law example
• N=1,000,000
• Sequential algorithm marks 2,122,048 cells
• Outputs 78,498 prime numbers
• Solution:
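The figures match a prime sieve over 2 to 1,000,000 (78,498 is the count of primes below one million), so the sketch below assumes a sequential Sieve of Eratosthenes; the exact number of marking operations depends on whether multiples are marked starting from 2p or from p*p, so it may differ slightly from the 2,122,048 quoted above:

#include <stdio.h>
#include <stdlib.h>

#define N 1000000

int main(void)
{
    char *composite = calloc(N + 1, 1);    /* 0 = not yet marked (candidate prime) */
    long marks = 0, primes = 0;

    for (long p = 2; p * p <= N; p++) {
        if (composite[p]) continue;
        for (long m = p * p; m <= N; m += p) {
            composite[m] = 1;              /* mark one cell */
            marks++;                       /* count every marking operation */
        }
    }
    for (long i = 2; i <= N; i++)
        if (!composite[i]) primes++;       /* unmarked cells are the primes */

    printf("markings: %ld, primes: %ld\n", marks, primes);
    free(composite);
    return 0;
}

In an Amdahl-style analysis, the marking loop is treated as the parallelizable work and the remaining steps as the sequential fraction f; the slide leaves the exact split as an exercise.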
Limitations of Amdahl’s Law
• Fixed Problem Size Assumption
• Ignores Communication & Synchronization Overheads
• Sequential Fraction Is Not Always Constant
• Homogeneous Processor Assumption
• Overly Pessimistic for Large Problems
Parallel Architecture Paradigms
Michael Flynn in 1972 gave a taxonomy for categorizing different styles of
computer system architecture:
• Single instruction stream, single data stream (SISD)
• Single instruction stream, multiple data stream (SIMD)
• Multiple instruction stream, single data stream (MISD)
• Multiple instruction stream, multiple data stream (MIMD)
SISD
Single processor executes a single instruction stream on a single data stream
e.g., the classic von Neumann architecture
SIMD : Vector Computing
A single instruction is broadcast to multiple processing units,
each of which operates on a separate data stream
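A sketch of the SIMD idea in C: the same operation is applied element-wise across an array, which a vectorizing compiler (or the OpenMP simd directive used below, assumed to be available) can map onto vector instructions:

#include <stdio.h>

#define N 8

int main(void)
{
    float a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[N] = {8, 7, 6, 5, 4, 3, 2, 1};
    float c[N];

    /* One logical instruction (add) broadcast across many data elements */
    #pragma omp simd
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    for (int i = 0; i < N; i++)
        printf("%.0f ", c[i]);
    printf("\n");
    return 0;
}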
MISD
Multiple processors executing different instructions on the same
data stream
MIMD: Most Advanced Computers
Multiple processors or cores, each capable of executing different
instructions on different data streams independently
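A minimal MIMD sketch using POSIX threads (the data and task names are illustrative): two threads execute different instruction streams on different data at the same time:

#include <stdio.h>
#include <pthread.h>

/* One instruction stream: sum an array */
void *sum_task(void *arg)
{
    int *d = arg, s = 0;
    for (int i = 0; i < 4; i++) s += d[i];
    printf("sum = %d\n", s);
    return NULL;
}

/* A different instruction stream: find the maximum of a different array */
void *max_task(void *arg)
{
    int *d = arg, m = d[0];
    for (int i = 1; i < 4; i++) if (d[i] > m) m = d[i];
    printf("max = %d\n", m);
    return NULL;
}

int main(void)
{
    int a[4] = {1, 2, 3, 4}, b[4] = {7, 3, 9, 5};
    pthread_t t1, t2;
    pthread_create(&t1, NULL, sum_task, a);
    pthread_create(&t2, NULL, max_task, b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}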
Memory Access Classification
• An alternative way to classify parallel systems is by how their cores access memory
• This classification focuses on memory sharing and communication between cores
Shared Memory Systems
• Cores share access to a common memory space
• Cores coordinate their tasks by modifying shared memory locations
Shared Memory Computing
Advantages:
• Easier to program due to shared memory access
• Well-suited for symmetric multiprocessing (SMP) systems
• Efficient for tasks that require frequent communication and coordination
between threads or processes.
Challenges:
• Scalability can be limited due to memory contention
• Careful synchronization mechanisms are required to prevent race conditions.
Example
Fragment 0:
  while (x == 0);
  x = 1;
Fragment 1:
  x = 2;
Example
void withdraw(int amount)
{
    if (balance - amount > 0)
        balance -= amount;
}
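The fragment above has a race condition: two threads can both pass the balance check before either subtracts. One common fix, sketched here with a pthread mutex (the lock name is illustrative, not from the slides), makes the check and the update a single critical section:

#include <pthread.h>

int balance = 100;
pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

void withdraw(int amount)
{
    pthread_mutex_lock(&balance_lock);     /* only one thread at a time from here... */
    if (balance - amount > 0)
        balance -= amount;
    pthread_mutex_unlock(&balance_lock);   /* ...to here */
}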
Example
Atomic:
  Location x = 1;
  Location y = 5;
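One reading of the atomic example: stores to the shared locations x and y must be indivisible, so no other core ever observes a partial update. A minimal sketch with C11 atomics (stdatomic.h, assuming a C11 compiler):

#include <stdio.h>
#include <stdatomic.h>

_Atomic int x = 0;
_Atomic int y = 0;

int main(void)
{
    atomic_store(&x, 1);   /* each store completes as a single indivisible operation */
    atomic_store(&y, 5);
    printf("x = %d, y = %d\n", atomic_load(&x), atomic_load(&y));
    return 0;
}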
Distributed Memory Systems
• Each core has its own private memory
• Cores coordinate their tasks by communicating across a network
Example
Fragment 0:
  x = 5;
  receive(1, y);
  send(1, x + y);
Fragment 1:
  send(0, 10);
  receive(0, x);
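A sketch of the same exchange in MPI (assuming an MPI installation is available; rank 0 plays Fragment 0 and rank 1 plays Fragment 1):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, x, y;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        x = 5;
        MPI_Recv(&y, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE); /* y = 10 */
        x = x + y;
        MPI_Send(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);                    /* send 15 */
    } else if (rank == 1) {
        int ten = 10;
        MPI_Send(&ten, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE); /* x = 15 */
        printf("rank 1 received x = %d\n", x);
    }

    MPI_Finalize();
    return 0;
}

Run with, for example, mpirun -np 2 ./a.out; all coordination goes through explicit messages rather than shared variables.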
Distributed Memory Computing
Advantages:
• Scalable for large systems as memory is distributed
• Suitable for parallel applications with minimal communication between
processes
• Harness the power of a large number of nodes.
Challenges:
• Programming can be more complex due to explicit message passing, data
distribution, and synchronization requirements.
Hybrid Systems
Combine shared-memory nodes with distributed-memory architectures
Common in clusters, where individual nodes are multicore shared-memory
systems connected via a network
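A common hybrid pattern is MPI between distributed-memory nodes plus OpenMP threads inside each shared-memory node; a minimal sketch, assuming both MPI and OpenMP are available:

#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int provided, rank;
    /* MPI handles communication across nodes */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* OpenMP threads share memory within one node */
    #pragma omp parallel
    printf("rank %d, thread %d of %d\n",
           rank, omp_get_thread_num(), omp_get_num_threads());

    MPI_Finalize();
    return 0;
}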
Memory Access Models
• Uniform Memory Access (UMA)
• Non Uniform Memory Access (NUMA)
• Cache-Only Memory Access (COMA)
Uniform Memory Access
• All processors have equal and uniform access to a single shared
memory pool.
• Memory access times are roughly the same for all processors
• The memory access pattern is similar to that of a single-processor system,
which makes programming simpler.
Uniform Memory Access
• As the number of processors increases, contention for the shared
memory bus can lead to performance bottlenecks.
• UMA is more suited for smaller multiprocessor systems.
Non Uniform Memory Access
• The system is composed of multiple nodes, each containing processors
and a local memory.
• Processors have faster access to their local memory than to remote
memory in other nodes.
• Memory access times can vary based on whether the data is stored locally
or remotely.
Non Uniform Memory Access
• Scales better as the number of processors increases, since each node can
have its own local memory and processors.
• Used in large-scale multiprocessor systems, like servers and high-
performance computing clusters.
Cache-Only Memory Access
• Only cache memories are present; there is no main memory as in UMA or
NUMA
• The main goal is to provide a unified view of the memory while efficiently
distributing data across caches.
Cache-Only Memory Access
• COMA architectures have not gained as much traction as UMA and
NUMA due to their complexity and limited benefits compared to other
memory models.
Summary
Feature            | COMA                               | UMA                                     | NUMA
Memory Access Time | Dynamic and variable               | Uniform                                 | Non-uniform, depending on location
Complexity         | High                               | Low                                     | Medium
Scalability        | Limited by complexity              | Limited by memory bus contention        | High
Performance        | Potentially high for specific apps | Predictable and uniform                 | High with proper optimizations
Support            | Limited                            | Extensive                               | Extensive
Typical Use Cases  | Specialized, research systems      | Small to medium multiprocessor systems  | Large-scale multiprocessor systems
Thank You!