Introduction to High Performance Scientific Computing
Autumn 2016
Lecture 15
Prasun Ray
Imperial College London
28 November 2016
Parallel computing paradigms
Distributed memory
• Each (4-core) chip has its own memory
• The chips are connected by network ‘cables’
• MPI coordinates communication between two or more CPUs
Parallel computing paradigms
Related approaches:
• Hybrid programming: mix of shared-memory (OpenMP) and
distributed-memory (MPI) programming (see the sketch after this list)
• GPUs: shared-memory programming (CUDA or OpenCL)
• Coprocessors and co-array programming
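As an illustration of the hybrid approach, here is a minimal sketch (not course code; the program name and compile command are assumptions, and production hybrid codes would typically call MPI_INIT_THREAD rather than MPI_INIT): each MPI process opens an OpenMP parallel region and reports its rank and thread number.

! Hybrid MPI + OpenMP sketch: every MPI process launches OpenMP threads
! compile with, e.g., $ mpif90 -fopenmp -o hybrid_sketch.exe hybrid_sketch.f90
program hybrid_sketch
    use mpi
    use omp_lib
    implicit none
    integer :: myid, numprocs, ierr

    call MPI_INIT(ierr)                              ! simple init; hybrid codes often use MPI_INIT_THREAD
    call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

    !$OMP PARALLEL
    print *, 'process ', myid, ' of ', numprocs, ', thread ', omp_get_thread_num()
    !$OMP END PARALLEL

    call MPI_FINALIZE(ierr)
end program hybrid_sketch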
MPI intro
• MPI: Message Passing Interface
• Standard for exchanging data between processors
• Supports Fortran, C, and C++
• Can also be used with Python
OpenMP schematic
• Program starts with a single master thread.
• A FORK then launches a parallel region with multiple threads.
• Each thread has access to all variables introduced previously.
• A JOIN can end the parallel region if/when desired; parallel regions can be launched again later as needed.
[Figure: fork-join model: start program → master thread → FORK → parallel region (4 threads) → JOIN → serial region (1 thread)]
MPI schematic
• Program starts with all processes running.
• MPI controls communication between processes.
[Figure: start program → all processes launch at once; the entire run is a parallel region (4 processes), with MPI handling communication between them]
MPI intro
• Basic idea: calls to MPI subroutines control data exchange
between processors
• Example:
call MPI_BCAST(n, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
This sends the integer n (which has size 1) from processor 0 to all
of the other processors.
MPI broadcast
[Diagram: before the broadcast only P0 holds data; after MPI_BCAST, P0, P1, P2, and P3 each hold a copy of data]
MPI intro
Generally, need to specify:
• source and/or destination of message
• size of data contained in message
• type of data contained in message (integer, double precision, …)
• the data itself (or its location)
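As a sketch of how these pieces map onto the MPI_BCAST call above (reading n on process 0 is an illustrative assumption; the declarations and MPI setup are as in the templates on the following slides):

! Illustrative use of MPI_BCAST: process 0 reads n, then broadcasts it
if (myid == 0) read(*,*) n        ! only the root initially has a meaningful value of n

call MPI_BCAST(n,              &  ! the data itself (or its location)
               1,              &  ! size of the data in the message
               MPI_INTEGER,    &  ! type of the data
               0,              &  ! source: root process of the broadcast
               MPI_COMM_WORLD, &  ! communicator containing all processes
               ierr)
! afterwards every process has the same value of n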
Fortran code structure
! Basic Fortran 90 code structure

!1. Header
program template

!2. Variable declarations (e.g. integers, real numbers,...)

!3. Basic code: input, loops, if-statements, subroutine calls
print *, 'template code'

!4. End program
end program template

! To compile this code:
! $ gfortran -o f90template.exe f90template.f90
! To run the resulting executable: $ ./f90template.exe
MPI intro
! Basic MPI + Fortran 90 code structure (see mpif90template.f90)

!1. Header
program template
use mpi

!2a. Variable declarations (e.g. integers, real numbers,...)
integer :: myid, numprocs, ierr

!2b. Initialize MPI
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

!3. Basic code: input, loops, if-statements, subroutine calls
print *, 'this is proc # ', myid, ' of ', numprocs

!4. End program
call MPI_FINALIZE(ierr)
end program template

! To compile this code:
! $ mpif90 -o mpitemplate.exe mpif90template.f90
! To run the resulting executable with 4 processes: $ mpiexec -n 4 mpitemplate.exe
MPI intro
• Compile + run:
$ mpif90 -o mpif90template.exe mpif90template.f90

$ mpiexec -n 4 mpif90template.exe
this is proc # 0 of 4
this is proc # 3 of 4
this is proc # 1 of 4
this is proc # 2 of 4
Note: The number of processes specified with mpiexec can be larger than the number of cores on your machine, but then the processes share cores and are time-sliced rather than truly running in parallel.
MPI+Fortran example: computing an integral
• Estimate the integral $I = \int_0^1 f(x)\,dx$ with the midpoint rule:
$I \approx \sum_{i=1}^{N} f(x_i)\,\Delta x$, where $\Delta x = 1/N$ and $x_i = (i - 1/2)\Delta x$
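For reference, a minimal serial sketch of the midpoint rule. The integrand f(x) = 4/(1+x^2), whose exact integral over [0,1] is pi, is an assumption chosen to match the numerical output shown later; midpoint_p.f90 defines its own integrand subroutine.

! Serial midpoint-rule sketch (assumed integrand: f(x) = 4/(1+x^2), exact integral = pi)
program midpoint_serial
    implicit none
    integer :: i, N
    double precision :: dx, xm, f, total

    N = 1000
    dx = 1.d0/dble(N)
    total = 0.d0
    do i = 1, N
        xm = dx*(dble(i) - 0.5d0)      ! midpoint of interval i
        f = 4.d0/(1.d0 + xm*xm)        ! assumed integrand
        total = total + dx*f           ! add this interval's contribution
    end do
    print *, 'integral estimate = ', total
end program midpoint_serial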
MPI+Fortran quadrature
Two most important tasks:
1. Decide how many intervals per processor
2. Each processor will compute its own partial sum, sum_proc;
how do we compute sum(sum_proc)?
MPI+Fortran quadrature
Task 1: decide how many intervals per processor
• N = number of intervals
• numprocs = number of processors
• Need to compute Nper_proc: intervals per processor
• Basic idea: if N = 8 * numprocs, then Nper_proc = 8
• But integer division rounds down: if N is not a multiple of numprocs, some intervals would be left out, and if N < numprocs then N/numprocs = 0
• Instead use the (integer) ceiling of N/numprocs:
Nper_proc = (N + numprocs - 1)/numprocs
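For example (a hypothetical case, not from the slides), with N = 1000 intervals and numprocs = 3:

Nper_proc = (1000 + 3 - 1)/3 = 1002/3 = 334   (integer division)

so processors 0 and 1 each handle 334 intervals, and processor 2 handles the remaining 332 (its end index is capped at N = 1000, as in the code excerpt below).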
MPI+Fortran quadrature
Task 2: each processor computes its own partial sum, sum_proc; how do we combine them?
• Use MPI_REDUCE (see the schematic below)
MPI reduce
[Diagram: P0, P1, P2, P3 each hold a value (data1, data2, data3, data4); after the reduction, P0 holds result, e.g. data1+data2+data3+data4 for MPI_SUM]
MPI+Fortran quadrature
• Reduction options: MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD
• For quadrature, we need MPI_SUM
MPI+Fortran quadrature
For quadrature, we need MPI_SUM:
call MPI_REDUCE(data, result, 1, MPI_DOUBLE_PRECISION, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
This will:
1. Collect the double precision variable data, which has size 1, from each processor.
2. Compute the sum (because we have chosen MPI_SUM) and store the value in result on processor 0.
Note: Only processor 0 will have the final sum. With MPI_ALLREDUCE, the result will be on every processor.
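For comparison, a minimal sketch of the MPI_ALLREDUCE version (same arguments minus the root rank, since every process receives the result; sum_proc and sum follow the names used in midpoint_p.f90 below):

! Every process contributes sum_proc; every process ends up with the total in sum
call MPI_ALLREDUCE(sum_proc, sum, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                   MPI_COMM_WORLD, ierr)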
MPI+Fortran quadrature
midpoint_p.f90: 1. distribute data, 2. compute sum_proc, 3. reduction

!set number of intervals per processor
Nper_proc = (N + numprocs - 1)/numprocs

!starting and ending points for processor
istart = myid * Nper_proc + 1
iend = (myid+1) * Nper_proc
if (iend>N) iend = N

!loop over intervals computing each interval's contribution to integral
do i1 = istart,iend
    xm = dx*(i1-0.5) !midpoint of interval i1
    call integrand(xm,f)
    sum_i = dx*f
    sum_proc = sum_proc + sum_i !add contribution from interval to total integral
end do

!collect double precision variable, sum, with size 1 on process 0 using the MPI_SUM option
call MPI_REDUCE(sum_proc,sum,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,MPI_COMM_WORLD,ierr)
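One detail the excerpt does not show is how every process obtains N, dx, and the initial sum_proc. A plausible sketch (an assumption for illustration, not necessarily what midpoint_p.f90 does) is to read N on process 0 and broadcast it:

if (myid == 0) read(*,*) N                    ! e.g. read the number of intervals on the root
call MPI_BCAST(N, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
dx = 1.d0/dble(N)                             ! interval width, assuming the domain is [0,1]
sum_proc = 0.d0                               ! initialize this process's partial sum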
MPI+Fortran quadrature
Compile and run:
$ mpif90 -o midpoint_p.exe midpoint_p.f90

$ mpiexec -n 2 midpoint_p.exe
number of intervals = 1000
number of procs = 2
Nper_proc= 500
The partial sum on proc # 0 is: 1.8545905426699112
The partial sum on proc # 1 is: 1.2870021942532193
N= 1000
sum= 3.1415927369231307
error= 8.3333337563828991E-008
Other collective operations
• Scatter and gather
MPI scatter
[Diagram: before the scatter, P0 holds the array [f1,f2,f3,f4]; after MPI_SCATTER, P0 holds f1, P1 holds f2, P2 holds f3, and P3 holds f4]
MPI gather
[Diagram: the reverse of the scatter: P0, P1, P2, P3 hold f1, f2, f3, f4 respectively; after MPI_GATHER, P0 holds the array [f1,f2,f3,f4]]
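A minimal sketch of the two calls pictured above (the program, array, and variable names are illustrative assumptions, not from the course code; each process sends and receives one double precision value):

! Minimal scatter/gather sketch (hypothetical names, not course code)
program scatter_gather_sketch
    use mpi
    implicit none
    integer :: i, myid, numprocs, ierr
    double precision :: f_local
    double precision, allocatable :: fall(:)

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

    allocate(fall(numprocs))
    if (myid == 0) fall = (/ (dble(i), i = 1, numprocs) /)   ! data exists only on the root

    ! Scatter: process 0 sends one element of fall to each process
    call MPI_SCATTER(fall, 1, MPI_DOUBLE_PRECISION, f_local, 1, MPI_DOUBLE_PRECISION, &
                     0, MPI_COMM_WORLD, ierr)
    f_local = 2.d0*f_local                                   ! each process works on its own value

    ! Gather: process 0 collects the updated values back into fall, in rank order
    call MPI_GATHER(f_local, 1, MPI_DOUBLE_PRECISION, fall, 1, MPI_DOUBLE_PRECISION, &
                    0, MPI_COMM_WORLD, ierr)
    if (myid == 0) print *, 'gathered: ', fall

    call MPI_FINALIZE(ierr)
end program scatter_gather_sketch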
Other collective operations
• Scatter and gather
• Gather all of the particles onto a processor
• Compute interaction forces for the particles on that processor:
$\frac{d^2 x_i}{dt^2} = \sum_{j=1}^{N} f(|x_i - x_j|), \quad i = 1, 2, \ldots, N$
• Avoid for big problems (why?)
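A minimal sketch of that pattern (names are hypothetical and declarations/setup are omitted, as in the excerpts above; pair_force stands in for the f(|x_i - x_j|) in the equation, and each process is assumed to own Nlocal = N/numprocs positions in x_local):

! Gather every particle position onto every process. Storage and communication grow
! with N on each process, which is why this should be avoided for big problems.
call MPI_ALLGATHER(x_local, Nlocal, MPI_DOUBLE_PRECISION, &
                   x_all, Nlocal, MPI_DOUBLE_PRECISION, MPI_COMM_WORLD, ierr)

! Each process then computes interaction forces only for the particles it owns
do i1 = 1, Nlocal
    iglobal = myid*Nlocal + i1                    ! global index of local particle i1
    accel(i1) = 0.d0
    do j1 = 1, N
        if (j1 /= iglobal) accel(i1) = accel(i1) + pair_force(abs(x_local(i1) - x_all(j1)))
    end do
end do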
MPI collective data movement
[Figure: schematic of the MPI collective data movement operations, from the book Using MPI]