PARALLEL PROGRAMMING
Course Code: BDS701

Course objectives:

• Explore the need for parallel programming
• Explain how to parallelize programs on MIMD systems
• Demonstrate how to apply the MPI library to parallelize suitable programs
• Demonstrate how to apply OpenMP pragmas and directives to parallelize suitable programs
• Demonstrate how to design CUDA programs


Module 1

Introduction to parallel programming, Parallel hardware and parallel software – Classifications of parallel computers, SIMD systems, MIMD systems, Interconnection networks, Cache coherence, Shared-memory vs. distributed-memory, Coordinating the processes/threads, Shared-memory, Distributed-memory.
Introduction to parallel programming
• Parallel programming is the execution of multiple instructions simultaneously to solve a problem faster.
• It increases computational speed and efficiency by utilizing multiple processing units.
• Important in modern computing due to multi-core processors and high-performance computing needs.
Parallel Hardware and Parallel Software
1. Parallel hardware refers to computer systems with multiple processing units that work together to perform tasks simultaneously.
 It enables faster execution of complex computations by distributing the workload across processors.

2. Parallel software is designed to execute multiple tasks simultaneously using multiple processors or cores.
 It complements parallel hardware to achieve high performance, scalability, and efficient resource use.
 Requires special programming models, libraries, and design techniques, as the sketch below illustrates.
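As a minimal sketch of such a programming model, the following program uses OpenMP (one of the libraries named in the course objectives) to launch a team of threads, typically one per core, each printing its own ID. The thread count and output order are runtime-dependent.

#include <stdio.h>
#include <omp.h>

int main(void) {
    /* Each thread executes this block independently and in parallel. */
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        printf("hello from thread %d of %d\n", id, omp_get_num_threads());
    }
    return 0;
}

Compile with an OpenMP-capable compiler, e.g., gcc -fopenmp.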


Classification of Parallel Computers
Parallel computers can be classified in two simple ways:

1. Flynn’s Taxonomy
- Classifies systems based on instruction and data streams.
- Two common types:
• SIMD: Single instruction, multiple data.
• MIMD: Multiple instructions, multiple data.
- Helps understand how tasks are handled in parallel.

2. Memory Access Model
- Classifies systems based on how memory is accessed by processors.
- Two main types:
• Shared Memory: All cores use the same memory.
• Distributed Memory: Each core has its own memory.
- Communication between cores differs in each type.
SIMD SYSTEMS
Single Instruction, Multiple Data (SIMD) systems are a type of parallel system.

1. SIMD systems apply the same instruction simultaneously to multiple data streams.

2. Conceptually, a SIMD system has one control unit and multiple datapaths.

3. The control unit broadcasts an instruction to all datapaths, each of which either executes
the instruction on its data or remains idle.

4. For example, in vector addition, SIMD can add elements of two arrays, x and y, element-wise in parallel. Consider the loop:

   for (i = 0; i < n; i++)
      x[i] += y[i];

5. If the SIMD system has n datapaths, each datapath i can load x[i] and y[i], perform the addition x[i] += y[i], and store the result back in x[i].

6. If the system has m datapaths where m < n, the additions are executed in blocks of m elements at a time. For example, if m = 4 and n = 15, the system processes elements in groups: 0–3, 4–7, 8–11, and 12–14.

7. In the last group (elements 12–14), only three datapaths are used, so one datapath remains idle. The requirement that all datapaths execute the same instruction or stay idle can reduce SIMD performance. For instance, if we want to add only when y[i] is positive:

   for (i = 0; i < n; i++)
      if (y[i] > 0.0) x[i] += y[i];

some datapaths may be idle depending on the condition, leading to inefficiency. The sketch below illustrates the block-by-block execution from item 6.
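A minimal sketch of that block-by-block execution in plain C. The block width M stands in for the number of hardware datapaths; treating it as a compile-time constant the programmer controls is an assumption of this illustration (on real SIMD hardware the lane count is fixed by the machine).

#include <stdio.h>

#define M 4  /* assumed number of SIMD datapaths (lanes) */

/* Strip-mined vector addition: process n elements in blocks of M,
   mirroring how a SIMD unit with M datapaths steps through x and y. */
void vec_add(double *x, const double *y, int n) {
    int i = 0;
    for (; i + M <= n; i += M)        /* full blocks: all M lanes active   */
        for (int j = 0; j < M; j++)   /* lane j handles element i + j      */
            x[i + j] += y[i + j];
    for (; i < n; i++)                /* remainder block (e.g., 12-14 when */
        x[i] += y[i];                 /* n = 15): some lanes would be idle */
}

int main(void) {
    double x[15] = {0}, y[15];
    for (int i = 0; i < 15; i++) y[i] = i;
    vec_add(x, y, 15);
    printf("x[14] = %g\n", x[14]);    /* prints 14 */
    return 0;
}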
MIMD SYSTEMS

• MIMD (Multiple Instruction, Multiple Data) systems run multiple instruction streams on multiple data streams.
• Each processor/core operates independently with its own control unit and datapath.
• Processors are asynchronous and operate at their own pace.
• Useful for task parallelism and complex computing systems.


Types of MIMD Systems
1. Shared-memory systems:
- Processors access a common memory.
- Implicit communication using shared data.

2. Distributed-memory systems:
- Each processor has private memory.
- Communication via explicit message-passing functions.
SHARED-MEMORY SYSTEMS

 The most widely available shared-memory systems use one or more multicore processors.
 There is one large memory unit that all CPUs can access.
 Processors are connected to memory via an interconnect (such as a bus or network switch).
 All processors share the same address space, meaning any CPU can access any memory location directly.
Uniform Memory Access (UMA)
In systems where all cores access memory with equal latency, the memory access time remains uniform regardless of which core accesses which memory location.

Non-Uniform Memory Access (NUMA)
When each core has faster access to its own local memory block and slower access to other cores' memory, the system is referred to as a Non-Uniform Memory Access (NUMA) system.
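A minimal sketch of programming a shared-memory system with OpenMP: every thread reads and writes the same array because all threads share one address space, and the reduction clause coordinates their partial sums. The array size is an arbitrary choice for the illustration.

#include <stdio.h>
#include <omp.h>

#define N 1024  /* arbitrary problem size for the illustration */

int main(void) {
    static double a[N];
    double sum = 0.0;

    /* All threads access the single shared array a directly:
       no explicit communication is needed. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* The reduction clause safely combines each thread's partial sum. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %g (expected %d)\n", sum, N);
    return 0;
}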
DISTRIBUTED-MEMORY SYSTEMS

• Each CPU has its own private memory.
• Processors cannot access each other's memory directly.
• To share data, processors send messages through an interconnect (network), as the MPI sketch after this list shows.
• Most common type today: clusters.
• Built from multiple standard computers connected via networks (e.g., Ethernet).
• Each cluster node is usually a shared-memory system with multicore processors.
• These are called hybrid systems (shared memory within nodes, distributed memory between nodes).
• Grids connect computers over large distances (geographically dispersed).
• Grids can use different hardware across nodes and act as one distributed-memory system.
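A minimal message-passing sketch using MPI (the library named in the course objectives). Because each process has private memory, process 0 must explicitly send a value to process 1; the payload and tag are arbitrary choices, and the program assumes it is launched with at least two processes (e.g., mpiexec -n 2).

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process has private memory, so data must be sent explicitly. */
    if (rank == 0) {
        value = 42;  /* arbitrary payload */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("process 1 received %d from process 0\n", value);
    }

    MPI_Finalize();
    return 0;
}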
INTERCONNECTION NETWORKS

• An interconnect links processors and memory.
• The speed of the interconnect greatly affects system performance.
• A slow interconnect can bottleneck parallel programs.
• Shared- and distributed-memory systems use different interconnect types.
Shared-Memory Interconnects

• Earlier systems used a bus to connect processors and memory.
• A bus is simple and cost-effective, but all devices share the same lines.
• More processors → more contention → performance drops.
• Modern systems use switched interconnects instead of buses.

Switched Interconnects (Crossbar)

• Use switches for efficient, organized communication.
• A crossbar connects processors and memory modules via bidirectional links.
• Switches can be configured for different data paths.
• If there are at least as many memory modules as processors, conflicts occur only when two processors access the same module (illustrated in the sketch after this list).
• Allows simultaneous communication between multiple devices.
• Faster than buses, but more expensive due to the cost of switches and links.
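A toy sketch of that conflict rule: with one memory request per processor, a crossbar access pattern conflicts exactly when two processors target the same module. The processor count and request arrays are invented for the example.

#include <stdio.h>
#include <stdbool.h>

#define P 4  /* processors */
#define M 4  /* memory modules (M >= P) */

/* Returns true if two processors request the same memory module. */
bool has_conflict(const int req[P]) {
    bool used[M] = { false };
    for (int i = 0; i < P; i++) {
        if (used[req[i]]) return true;
        used[req[i]] = true;
    }
    return false;
}

int main(void) {
    int ok[P]    = {0, 1, 2, 3};  /* distinct modules: all proceed at once */
    int clash[P] = {0, 1, 1, 3};  /* P1 and P2 both want module 1 */
    printf("ok: %d, clash: %d\n", has_conflict(ok), has_conflict(clash));
    return 0;
}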
Figure – Shared-Memory Interconnects:
(a) A crossbar switch connecting four processors (Pi) and four memory modules (Mj).
(b) Configuration of internal switches in a crossbar.
(c) Simultaneous memory accesses by the processors.
