High Performance Computing

1. A sequential code of Monte Carlo Pi calculation is given in the PDF file of MPI programming. Write its MPI version and run it on a parallel computer to discuss the performance improvement and scalability by increasing the number of MPI processes up to 8. In the MPI code, the execution time of the Pi calculation must be measured by properly using MPI_Wtime and be printed out (printf) at the end of the execution. What is the limiting factor of its performance? Will it scale to a large-scale system of a million nodes?

The MPI version of the program is shown on the next page, titled "pi_mpi.c". After compiling the code, it was executed on a parallel computer while varying the number of processes up to 8. The execution time, speedup, and efficiency are summarized in Table 1.

Table 1. Summary of the execution time, speedup, and efficiency for several numbers of processes of the Monte Carlo MPI program.

Number of processes    Execution time (secs)    Speedup    Efficiency
1 (sequential)         10.91626                 1.00000    1.00000
2                      5.458785                 1.99976    0.99988
3                      3.639033                 2.99977    0.99992
4                      2.735953                 3.98993    0.99748
5                      2.183623                 4.99915    0.99983
6                      1.819824                 5.99853    0.99975
7                      1.563699                 6.98105    0.99729
8                      1.368491                 7.97686    0.99711
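
The speedup and efficiency columns are obtained in the usual way as S(p) = T(1)/T(p) and E(p) = S(p)/p, where T(p) is the execution time on p processes. As a check on the last row: S(8) = 10.91626/1.368491 ≈ 7.977 and E(8) = 7.977/8 ≈ 0.997, matching the table.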

Figure 1. Efficiency vs. number of processes of Pi_MPI.


For this Monte Carlo simulation, N was fixed to 100000000. The only communication required by the program is the gathering of all partial results at the very end, where one value per process is sent. The general trend, as can be seen from Fig. 1, is that the efficiency decreases as the number of processes increases. However, since a very large N is used in this particular case, the communication cost is minor compared with the cost of generating the random numbers. As the number of processes increases, each process generates fewer random numbers while the total amount of data gathered at the end grows. The communication overhead therefore becomes the limiting factor, although its effect is only noticeable when the number of processes approaches the
order of N. Nevertheless, since our program to find pi is basically embarrassingly parallel, more processes can be added, up to a million nodes, as long as the number of processes stays below N, and a speedup can be obtained in theory.
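
To make the claim about the limiting factor measurable rather than qualitative, the single reduction in pi_mpi.c can be timed separately. The fragment below is only a sketch of such a check (comm_time is a new variable introduced here; total, hometotal, taskid and rc are the names already used in the attached code):

/* time the only communication of the program separately */
double comm_time = -MPI_Wtime();
rc = MPI_Reduce(&total, &hometotal, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
comm_time += MPI_Wtime();
if (taskid == 0)
    printf("reduction time: %lf [sec]\n", comm_time);

For N = 100000000 the reduction involves only a single integer per process, so this time is expected to stay negligible compared with the random-number generation until the process count approaches the order of N.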

2. Download a sequential code of final.c available at the class web page. It is a simplified version of a 3-dimensional heat conduction simulation. The code is also shown on the next page. The computation results on one slice of the 3-dimensional space are printed out as an image file in the PGM format.

I. Parallelize the code with OpenMP, considering which variables should be private. You can check the correctness by comparing the output data with those of the original code.

The OpenMP code is shown on the next page, titled "heat_omp.c".

There are actually several ways to apply OpenMP to the current problem. First, we can divide the k-loop among a number of threads. By doing so, the variables "j" and "i" must be made private. We can also divide the j-loop (thus requiring "i" to be private), as well as the i-loop (no additional private variables are required). Comparing the two extreme cases, i.e. dividing the k-loop and the i-loop, the k-loop partition performs faster than the i-loop partition. This is because the i-loop version makes the OpenMP runtime redistribute the work among the threads for every iteration of the enclosing j- and k-loops, while the k-loop version distributes the work only once per time step. The minor disadvantage of the k-loop partition is that it stores private copies of "i" and "j" for every thread. For a comparison with 4 threads, the k-loop partition requires 194.84 secs while the i-loop partition requires 341.22 secs. The program attached in this file is the k-loop partition with OpenMP; a sketch of the two variants is shown below.
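
The following sketch shows only the placement of the directives for the two extreme variants discussed above; the array macro T, the buffers rb/wb, and the loop bounds are the ones defined in the attached heat_omp.c.

/* variant 1: parallelize the outermost k-loop.
   The threads are scheduled once per time step; j and i must be private
   because they are declared outside the parallel region. */
#pragma omp parallel for private(j,i)
for(k=1;k<KMAX-1;k++){
    for(j=1;j<JMAX-1;j++){
        for(i=1;i<IMAX-1;i++){
            T(i,j,k,wb) = (T(i,j,k,rb) + T(i,j,k-1,rb) + T(i,j,k+1,rb)
                         + T(i,j-1,k,rb) + T(i,j+1,k,rb)
                         + T(i-1,j,k,rb) + T(i+1,j,k,rb))/7;
        }
    }
}

/* variant 2: parallelize the innermost i-loop.
   No extra private variables are needed, but a parallel loop is started
   for every (k,j) pair, which is why this version is measured to be slower. */
for(k=1;k<KMAX-1;k++){
    for(j=1;j<JMAX-1;j++){
        #pragma omp parallel for
        for(i=1;i<IMAX-1;i++){
            T(i,j,k,wb) = (T(i,j,k,rb) + T(i,j,k-1,rb) + T(i,j,k+1,rb)
                         + T(i,j-1,k,rb) + T(i,j+1,k,rb)
                         + T(i-1,j,k,rb) + T(i+1,j,k,rb))/7;
        }
    }
}

On compilers supporting OpenMP 3.0 or later, a collapse(2) clause on the k-loop variant would be another option to expose more parallelism without paying the per-iteration scheduling cost of variant 2.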

The comparison of output.pgm and output_omp.pgm (Fig. 2 and Fig. 3) verifies that the OpenMP version of the program produces the correct results.
Figure 2. Sequential result. Figure 3. OpenMP result.

II. Under the assumption of a specific domain decomposition method, describe necessary communications if the code is parallelized with MPI.

The domain decomposition was performed on the highest-level loop, i.e. "k". Thus, every process only calculates T from k_initial to k_finish. However, for every time step, the temperature values at k_initial-1 and k_finish+1 from the previous time step are required. This is where communication becomes necessary. The schematic of this communication is depicted in Fig. 4. This communication uses point-to-point messages between neighbouring processes (blocking MPI_Send/MPI_Recv in the attached code).

Figure 4. Schematic of the communication.


After reaching time TMAX, each process stores the current temperature values from k_initial to k_finish. Thus, to complete the program, a gather operation could be performed to collect all of the temperature values from 0 to KMAX. However, when writing this MPI version of the heat transfer, since the print() function only prints a particular slice of the data, a technique that determines which process should perform the print() function was used instead of a gather operation, to reduce time. A sketch of the per-time-step halo exchange is shown below.
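
As an illustration of the communications described above, the sketch below exchanges one whole k-plane with each neighbour per time step using MPI_Sendrecv. It is only a sketch under the data layout of the attached code (a fixed-k plane of IMAX*JMAX doubles is contiguous in temp); the variables ini, end, id, p, wb and the macro T are the ones used in heat_mpi.c, while up, down and plane are names introduced here for the example.

/* neighbours in the k-direction; MPI_PROC_NULL turns the calls into no-ops at the ends */
int up   = (id == p-1) ? MPI_PROC_NULL : id+1;  /* owner of the planes above (k >= end) */
int down = (id == 0)   ? MPI_PROC_NULL : id-1;  /* owner of the planes below (k <  ini) */
int plane = IMAX*JMAX;                          /* one k-plane is contiguous in memory  */
MPI_Status status;

/* send my lowest interior plane down, receive the halo plane above my sub-domain */
MPI_Sendrecv(&T(0,0,ini,wb),   plane, MPI_DOUBLE, down, 1,
             &T(0,0,end,wb),   plane, MPI_DOUBLE, up,   1,
             MPI_COMM_WORLD, &status);
/* send my highest interior plane up, receive the halo plane below my sub-domain */
MPI_Sendrecv(&T(0,0,end-1,wb), plane, MPI_DOUBLE, up,   2,
             &T(0,0,ini-1,wb), plane, MPI_DOUBLE, down, 2,
             MPI_COMM_WORLD, &status);

Exchanging whole planes in two calls also avoids the per-element messages of the attached implementation and the ordering issues of blocking sends, since MPI_Sendrecv pairs each send with the matching receive.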

III. Write an MPI version of the code, and discuss the scalability and parallel efficiency of the program considering the communication overhead.

The MPI version of this code is shown on the next page, titled "heat_mpi.c", and the result is shown in Fig. 5. As can be seen from the attached code, the current MPI code can only be run on 3 or more processes. The execution time, speedup, and efficiency for 1, 3, and 4 processes are summarized in Table 2, and the trend is illustrated in Fig. 6.

Figure 5. MPI result

Table 2. Summary of the execution time, speedup, and efficiency for several numbers of processes of the heat transfer MPI program.

Number of processes    Execution time (secs)    Speedup    Efficiency
1 (sequential)         734.36                   1.000      1.000
3                      305.93                   2.400      0.800
4                      255.33                   2.876      0.719

Figure 6. Efficiency vs. number of processes of Heat_MPI.

From the trend, we can see that the efficiency decreases considerably with the number of processes. This is because every additional process adds communication of about 2*JMAX*IMAX boundary values per time step. The more processes there are, the more communication takes place, which increases the overhead. Eventually a point is reached where adding processes yields no further speedup. In practice, finding the optimum number of processes for a specific problem is usually done by trial and error.
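
To put a rough number on this overhead (assuming the sizes of the attached code, IMAX = JMAX = 512, and 8-byte doubles): each internal interface exchanges about 2 × 512 × 512 × 8 bytes ≈ 4 MiB per time step, i.e. roughly 2 GiB over the TMAX = 500 steps, and every additional process introduces one more such interface while the computation per process shrinks.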

IV. Is there any other idea to improve the parallel efficiency of this program?

To improve the parallel efficiency of this particular heat problem, the communication overhead should be reduced. One possibility is to reduce the surface area of the interfaces between the partitions. A particular example for the case of 4 processes is depicted in Fig. 7. On the left is the decomposition used when writing the program, i.e. the k-loop partition. On the right is the proposed idea to improve the parallel efficiency. As can be seen, the total interface area between the partitions in the left figure is 3*IMAX*JMAX, while on the right the interface area is IMAX*JMAX + KMAX*JMAX. Since IMAX, JMAX, and KMAX are equal, the communication required for the proposed idea is 2/3 of the original one. By reducing the required communication, we can improve the efficiency for the same number of processes. For more processes, the idea can be extended straightforwardly.
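
With the concrete sizes of the attached code (IMAX = JMAX = KMAX = 512), the slab decomposition on the left exposes 3 × 512 × 512 = 786,432 interface cells, whereas the proposed decomposition on the right exposes 512 × 512 + 512 × 512 = 524,288 cells, which is exactly the 2/3 ratio mentioned above.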
Figure 7. (a) Original domain decomposition, (b) proposed domain decomposition.
pi_mpi.c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <time.h>

#define MASTER 0

int main(int argc, char* argv[])
{
    int numtasks, taskid, loop;
    int rc;
    int N = 100000000;
    int i, total = 0;
    double x, y, pi;
    double etime;

    /* MPI initialization */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
    printf("MPI task %d has started...\n", taskid);
    MPI_Barrier(MPI_COMM_WORLD);
    etime = -MPI_Wtime();
    loop = N/numtasks;          /* problem distribution (assumes N is divisible by numtasks) */

    srand(time(NULL) + taskid); /* offset the seed by the rank; multiplying would give rank 0 a constant seed of zero */
    for(i = 0; i < loop; i++){
        x = (double)rand()/RAND_MAX;
        y = (double)rand()/RAND_MAX;
        if (x*x + y*y < 1){
            total = total + 1;
        }
    }
    printf("task id %d has total %d from %d loops\n", taskid, total, loop);
    MPI_Barrier(MPI_COMM_WORLD);

    int hometotal;
    rc = MPI_Reduce(&total, &hometotal, 1, MPI_INT, MPI_SUM, 0,
                    MPI_COMM_WORLD); /* sum all points landing inside the circle */

    /* final pi calculation */
    if(taskid == 0){
        pi = 4*(double)hometotal/N;
        printf("pi is: %f\n", pi);
        etime += MPI_Wtime();
        printf("elapsed time: %lf [sec]\n", etime);
    }

    MPI_Finalize();
    return 0;
}
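
For reference, the code above can be built and launched in the usual MPI way, for example with mpicc pi_mpi.c -o pi_mpi and mpirun -np 8 ./pi_mpi; the exact commands depend on the MPI installation used.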

heat_omp.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <omp.h>

#define IMAX (512)


#define JMAX (512)
#define KMAX (512)
#define TMAX (500)

#define T(i,j,k,b) \
temp[(b)*IMAX*JMAX*KMAX+(k)*JMAX*IMAX+(j)*IMAX+(i)]

static void initmat(); /* initialize array */


static void print(); /* print one slice of 3d array */
static double gettime(); /* time measurement */

double* temp = NULL; /* temperature */

int main(int argc, char* argv[])
{
int i,j,k,t; /* loop indices */
int rb = 0, wb = 1; /* for double-buffering */
double stime; /* start */
double etime; /* end */

initmat(); /* setup the array */

stime = gettime();
/* kernel loop */
for(t=0;t<TMAX;t++){
/*omp definition*/
#pragma omp parallel private(j,i)
#pragma omp for
for(k=1;k<KMAX-1;k++){
for(j=1;j<JMAX-1;j++){
for(i=1;i<IMAX-1;i++){
T(i,j,k,wb) = (T(i,j,k,rb)
+ T(i,j,k-1,rb)
+ T(i,j,k+1,rb)
+ T(i,j-1,k,rb)
+ T(i,j+1,k,rb)
+ T(i-1,j,k,rb)
+ T(i+1,j,k,rb))/7;
}
}
}
rb = (rb==0?1:0);
wb = (wb==0?1:0); /* wb != rb */
fprintf(stderr,"%d\n",t);
}
etime = gettime();
fprintf(stderr,"elapsed: %lf [sec]\n",etime-stime);

print(); /* print out the result */

return 0;
}

void initmat(void)
{
int i,j,k;
temp = (double*)malloc(IMAX*JMAX*KMAX*2*sizeof(double));

for(k=0;k<KMAX;k++){
for(j=0;j<JMAX;j++){
for(i=0;i<IMAX;i++){
if(i==0 || i==IMAX-1){
T(i,j,k,0) = 255;
T(i,j,k,1) = 255;
}
else if(j==0 || j==JMAX-1){
T(i,j,k,0) = 255;
T(i,j,k,1) = 255;
}
else if(k==0 || k==KMAX-1){
T(i,j,k,0) = 255;
T(i,j,k,1) = 255;
}
else {
T(i,j,k,0) = 0;
T(i,j,k,1) = 0;
}
}
}
}
}

void print(void)
{
FILE *out_file = fopen("output_omp.pgm","w"); /* the output is a PGM image */
int i,j;

fprintf(out_file,"P2\n%d %d 255\n",IMAX,JMAX);
for(j=0;j<JMAX;j++)
for(i=0;i<IMAX;i++)
fprintf(out_file,"%d ",(unsigned char)T(i,j,KMAX/3,0));
fclose(out_file);
}

double gettime()
{
struct timeval tv;
gettimeofday(&tv,NULL);
return tv.tv_sec + tv.tv_usec/1000000.0;
}
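
For reference, the OpenMP version can be compiled, for example, with gcc -fopenmp heat_omp.c -o heat_omp, and the thread count used for a run can be selected through the OMP_NUM_THREADS environment variable.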

heat_mpi.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <mpi.h>
#include <math.h>
#include <time.h>

#define IMAX (512)


#define JMAX (512)
#define KMAX (512)
#define TMAX (500)

#define T(i,j,k,b) \
temp[(b)*IMAX*JMAX*KMAX+(k)*JMAX*IMAX+(j)*IMAX+(i)]

static void initmat(); /* initialize array */


static void print(); /* print one slice of 3d array */
static double gettime(); /* time measurement */

double* temp = NULL; /* temperature */

int main(int argc, char** argv)
{
int i,j,k,t; /* loop indices */
int rb = 0, wb = 1; /* for double-buffering */
double stime; /* start */
double etime; /* end */

/* domain distribution variable */


int id, p, mod, flr, asg, ini, end, rc, master;
MPI_Status status;

/*MPI initialization*/
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &id);
MPI_Comm_size(MPI_COMM_WORLD, &p);

initmat(); /* set up the array - every process stores the whole array */

/*domain distribution*/
mod=(KMAX-2)%p;
flr = ((KMAX-2)-mod)/p;
asg = (id<mod)?flr+1:flr;
ini = (id<=mod)?1+id*(flr+1):1+id*flr+mod;
end = ini+asg;
master = (ini<=KMAX/3 && KMAX/3<end)?id:-1; /* master is the process whose sub-domain contains the printed slice k = KMAX/3 */

fprintf(stderr,"master:%d process %d start: %d\t end:%d\n",master,id,ini,end);
MPI_Barrier(MPI_COMM_WORLD);
if(id==master){
stime = MPI_Wtime();}

/* kernel loop - each process only updates its respective k range */


for(t=0;t<TMAX;t++){
for(k=ini;k<end;k++){
for(j=1;j<JMAX-1;j++){
for(i=1;i<IMAX-1;i++){
T(i,j,k,wb) = (T(i,j,k,rb)
+ T(i,j,k-1,rb)
+ T(i,j,k+1,rb)
+ T(i,j-1,k,rb)
+ T(i,j+1,k,rb)
+ T(i-1,j,k,rb)
+ T(i+1,j,k,rb))/7;
}
}
}
MPI_Barrier(MPI_COMM_WORLD); /*wait until all processes are done*/

/* communication to send the values at the sub-domain interfaces (halo exchange) */
/* note: blocking MPI_Send is used with one MPI_DOUBLE per message; this relies on
   eager buffering, since neighbouring processes both send before they receive */
for(j=1;j<JMAX-1;j++){
for(i=1;i<IMAX-1;i++){
if(id!=0 && id!=p-1){
rc = MPI_Send(&T(i,j,ini,wb),1,MPI_DOUBLE,id-1,1,MPI_COMM_WORLD);
rc = MPI_Send(&T(i,j,end-1,wb),1,MPI_DOUBLE,id+1,2,MPI_COMM_WORLD);
rc = MPI_Recv(&T(i,j,ini-1,wb),1,MPI_DOUBLE,id-1,2,MPI_COMM_WORLD,&status);
rc = MPI_Recv(&T(i,j,end,wb),1,MPI_DOUBLE,id+1,1,MPI_COMM_WORLD,&status);
}
else if(id==0){
rc = MPI_Send(&T(i,j,end-1,wb),1,MPI_DOUBLE,id+1,2,MPI_COMM_WORLD);
rc = MPI_Recv(&T(i,j,end,wb),1,MPI_DOUBLE,id+1,1,MPI_COMM_WORLD,&status);
}
else if(id==p-1){
rc = MPI_Send(&T(i,j,ini,wb),1,MPI_DOUBLE,id-1,1,MPI_COMM_WORLD);
rc = MPI_Recv(&T(i,j,ini-1,wb),1,MPI_DOUBLE,id-1,2,MPI_COMM_WORLD,&status);
}
}
}

MPI_Barrier(MPI_COMM_WORLD);

rb = (rb==0?1:0);
wb = (wb==0?1:0); /* wb != rb */
if(id==master){
fprintf(stderr,"%d\n",t);}

MPI_Barrier(MPI_COMM_WORLD);

if(id==master){
etime = MPI_Wtime();
fprintf(stderr,"elapsed: %lf [sec]\n",etime-stime);
print(); /* print out the result by master*/
}
MPI_Finalize();
return 0;
}

void initmat(void)
{
int i,j,k;
temp = (double*)malloc(IMAX*JMAX*KMAX*2*sizeof(double));

for(k=0;k<KMAX;k++){
for(j=0;j<JMAX;j++){
for(i=0;i<IMAX;i++){
if(i==0 || i==IMAX-1){
T(i,j,k,0) = 255;
T(i,j,k,1) = 255;
}
else if(j==0 || j==JMAX-1){
T(i,j,k,0) = 255;
T(i,j,k,1) = 255;
}
else if(k==0 || k==KMAX-1){
T(i,j,k,0) = 255;
T(i,j,k,1) = 255;
}
else {
T(i,j,k,0) = 0;
T(i,j,k,1) = 0;
}
}
}
}
}
void print(void)
{
FILE *out_file = fopen("output_mpi.pgm","w"); /* the output is a PGM image */
int i,j;

fprintf(out_file,"P2\n%d %d 255\n",IMAX,JMAX);
for(j=0;j<JMAX;j++)
for(i=0;i<IMAX;i++)
fprintf(out_file,"%d ",(unsigned char)T(i,j,KMAX/3,0));
fclose(out_file);
}

double gettime()
{
struct timeval tv;
gettimeofday(&tv,NULL);
return tv.tv_sec + tv.tv_usec/1000000.0;
}
