Report Homework 1
Eko Rudiawan Jamzuri, 60775041H
March 15, 2019
1 OpenMP Experiment
This section describes the OpenMP experiments in the exercise.
1.1 OpenMP Pragma Completion 1
1.1.1 Source Code
The experiment uses different pragma directives and evaluates the total execution time of the program. Three directives are compared: for, sections, and single. The experiment also evaluates how the execution time changes with the array size. Each configuration is run five times; the total execution time of each run is recorded and the average is calculated manually. The pragma variants are listed below; P1, P2, and P3 in Table 1 refer to these listings, and a sketch of the timing measurement follows them.
1. P1 : Use for directive
#pragma omp parallel
{
    #pragma omp for
    for (i = 0; i < N; i++) {
        CC[i] = A[i] + B[i];
    }
}
2. P2 : Use sections directive
#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        for (i = 0; i < N; i++) {
            CC[i] = A[i] + B[i];
        }
    }
}
3. P3 : Use single directive
#pragma omp parallel
{
    #pragma omp single
    for (i = 0; i < N; i++) {
        CC[i] = A[i] + B[i];
    }
}
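The timing code itself is not shown in the report. The sketch below is one possible way the execution time of a single run could be measured, assuming the arrays A, B, and CC are already allocated and N is the array size; the function name and the use of omp_get_wtime() are assumptions, not the original harness.

#include <omp.h>

/* Hypothetical timing sketch: measures the wall-clock time of one run of
   the parallel vector addition (P1 variant). */
double time_vector_add(const double *A, const double *B, double *CC, int N)
{
    int i;
    double start = omp_get_wtime();   /* wall-clock time before the region */
    #pragma omp parallel
    {
        #pragma omp for
        for (i = 0; i < N; i++) {
            CC[i] = A[i] + B[i];
        }
    }
    double end = omp_get_wtime();     /* wall-clock time after the region */
    return end - start;               /* elapsed seconds for this run */
}

Each configuration is measured this way five times and the five values are averaged by hand, as described above.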
1.1.2 Result
The comparison of the three pragma directives is listed in Table 1, Table 2, Table 3, and Table 4 below. Table 1 uses an array of 100 elements, Table 2 uses 10,000 elements, Table 3 uses 1,000,000 elements, and Table 4 uses 100,000,000 elements.
Table 1: Result of execution time with N=100
Num Seq (us) P1 (ms) P2 (ms) P3 (ms)
1 0.91 12.91 4.90 7.21
2 0.93 5.02 6.73 5.22
3 0.45 7.81 6.43 5.63
4 0.96 7.17 6.75 4.70
5 0.94 6.04 8.90 5.93
Avg 0.84 7.79 6.74 5.74
Table 2: Result of execution time with N=10000
Num Seq (us) P1 (ms) P2 (ms) P3 (ms)
1 99.24 5.98 7.98 3.04
2 48.03 6.41 7.53 5.18
3 80.12 4.05 6.38 8.39
4 99.80 8.28 6.94 7.09
5 102.40 5.42 4.65 6.91
Avg 85.92 6.03 6.70 6.12
Table 3: Result of execution time with N=1000000
Num Seq (ms) P1 (ms) P2 (ms) P3 (ms)
1 5.15 6.58 16.67 14.95
2 5.15 5.95 15.41 11.54
3 5.17 8.33 14.30 13.60
4 5.18 9.19 17.52 15.45
5 5.17 7.31 16.11 16.22
Avg 5.16 7.47 16.00 14.35
Based on this experiment, the pragma with the for directive gives a significant speed-up only when it is used on a very large array (100,000,000 elements, Table 4).
Table 4: Result of execution time with N=100000000
Num Seq (ms) P1 (ms) P2 (ms) P3 (ms)
1 510.64 149.16 781.36 772.94
2 511.37 160.01 773.92 776.86
3 511.69 164.86 782.42 786.86
4 511.15 155.05 777.22 776.66
5 512.97 159.05 775.20 778.52
Avg 511.56 157.63 778.02 778.37
In that case the parallel version is clearly faster than the sequential program, but when the for directive is used on a small array the performance is worse than the sequential program. Both the sections and single directives have no optimizing effect on this code: the sections directive contains only one section block, so the loop is executed by a single thread, exactly as with the single directive. That is why the performance with those directives is no better than that of the sequential program.
1.2 OpenMP Pragma Completion 2
1.2.1 Source Code
The method is the same as in the first experiment: the pragma directive and the array size are changed, and the execution time is measured.
1. P1 : Use for directive
#pragma omp parallel
{
    #pragma omp for
    for (i = 0; i < N; i++) {
        CC[i] = A[i] + B[i];
    }
    #pragma omp for   /* note: a reduction(+:parallelSum) clause is needed here to avoid a data race on parallelSum */
    for (i = 0; i < N; i++) {
        parallelSum += CC[i];
    }
}
2. P2 : Use sections directive
#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        for (i = 0; i < N; i++) {
            CC[i] = A[i] + B[i];
        }
        #pragma omp section
        for (i = 0; i < N; i++) {
            parallelSum += CC[i];
        }
    }
}
3. P3 : Use single directive
#pragma omp parallel
{
    #pragma omp single
    for (i = 0; i < N; i++) {
        CC[i] = A[i] + B[i];
    }
    #pragma omp single
    for (i = 0; i < N; i++) {
        parallelSum += CC[i];
    }
}
1.2.2 Result
The experiment resulting similar result with the first experiment. For directive will have significant
impact when it use to calculate a big array. For directive has no better performance when it use to
calculate small size array compare to sequential program. Overview of the experiment can be seen
in Table 5, Table 6, and Table 7.
Table 5: Result of execution time with N=1000
Num Seq (us) P1 (ms) P2 (ms) P3 (ms)
1 5.39 7.91 4.58 7.81
2 11.20 10.17 3.94 6.70
3 5.36 9.30 6.88 10.87
4 11.17 8.84 5.47 5.43
5 5.39 4.05 8.28 6.18
Avg 7.70 8.05 5.83 7.40
Table 6: Result of execution time with N=1000000
Num Seq (ms) P1 (ms) P2 (ms) P3 (ms)
1 6.77 10.31 28.85 22.55
2 6.79 7.48 28.82 19.65
3 6.82 6.35 27.78 21.09
4 6.80 10.67 10.51 15.11
5 6.79 9.88 12.68 18.02
Avg 6.79 8.94 21.73 19.28
1.3 Bug Finding 1
The source code has a bug because, by default, all variables inside the parallel region are shared between threads. In this case the temp variable is accessed by all threads, so its value can easily be overwritten by another thread in the middle of a swap. To fix the bug, the pragma must be modified so that temp becomes a private variable, by adding a private clause at the end of the pragma. The code below shows how to make temp private.
Table 7: Result of execution time with N=10000000
Num Seq (ms) P1 (ms) P2 (ms) P3 (ms)
1 67.40 44.59 285.56 124.78
2 67.36 27.64 291.63 124.02
3 67.50 35.52 254.09 136.22
4 67.42 39.22 270.00 120.61
5 67.45 31.18 254.15 126.13
Avg 67.43 35.63 271.09 126.35
#pragma omp parallel for private(temp)
for (i = 0; i < N; i++) {
    temp = AA[i];
    AA[i] = BB[i];
    BB[i] = temp;
}
1.4 Bug Finding 2
The problem with this code is that, when it is executed in parallel, the variable x is shared between the threads, so its final value cannot be guaranteed; it depends on how the threads interleave their updates. To fix the bug, a critical pragma must be placed above the increment. The critical pragma marks a section of code that may be executed by only one thread at a time. With 8 threads, the final value of x will then be 8, because each thread executes the increment exactly once.
#pragma omp parallel shared(x)
{
    #pragma omp critical
    x = x + 1;
}
1.5 OpenMP Pragma Completion 3
To execute two blocks of code in parallel we declare a sections pragma, followed by one section block per piece of work. Each section is executed by a single thread, so with two section blocks OpenMP can use two threads, one per block.
#pragma omp parallel   /* note: making i private (private(i)) would avoid the two concurrent sections sharing the loop counter */
{
    #pragma omp sections
    {
        #pragma omp section
        for (i = 0; i < N; i++)
            c[i] = a[i] + b[i];
        #pragma omp section
        for (i = 0; i < N; i++)
            d[i] = a[i] * b[i];
    } /* end of sections */
} /* end of parallel region */
2 Exercise 1: Matrix Multiplication Optimization
This section describes the matrix multiplication experiments.
2.1 Serial Matrix Multiplication
The first experiment implements the standard nested for-loop multiplication over every row and column. The algorithm is evaluated by changing the matrix size and measuring the execution time of the multiplication. The resulting data are listed in Table 8: the larger the matrix, the longer the execution time. A sketch of the serial kernel is shown below.
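The multiplication code is not listed in the report; the following is a minimal sketch of the standard triple-loop kernel described above, assuming square NxN matrices stored as row-major arrays (the names A, B, C and the storage layout are assumptions).

/* Standard serial matrix multiplication sketch (assumed form): C = A * B
   for square N x N matrices stored row-major. */
void matmul(const double *A, const double *B, double *C, int N)
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i * N + k] * B[k * N + j];  /* B is read with stride N (column access) */
            C[i * N + j] = sum;
        }
    }
}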
Table 8: Serial Matrix Multiplication Performance
Num N=100 (ms) N=300 (ms) N=700 (ms)
1 6.80 130.32 1851.77
2 7.01 130.97 1870.68
3 11.52 134.08 1852.36
4 11.91 130.13 1852.32
5 11.15 130.75 1855.69
Avg 9.68 131.25 1856.56
2.2 Serial Matrix Multiplication with Transpose
The second experiment implements the transpose method, which typically transposes one operand first so that the inner loop reads both matrices row by row and therefore uses the cache better. This implementation performs better than the standard matrix multiplication; Table 9 shows the results, and a sketch of the method is given below.
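A minimal sketch of the transpose variant, assuming B is the operand that is transposed into a scratch matrix BT (the report does not show this code, so the exact form is an assumption):

/* Transpose-method sketch (assumed form): BT holds the transpose of B, so
   both A and BT are traversed row by row in the inner product. */
void matmul_transpose(const double *A, const double *B, double *C,
                      double *BT, int N)
{
    for (int i = 0; i < N; i++)          /* build BT = transpose of B */
        for (int j = 0; j < N; j++)
            BT[j * N + i] = B[i * N + j];

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i * N + k] * BT[j * N + k];  /* contiguous reads of both rows */
            C[i * N + j] = sum;
        }
    }
}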
Table 9: Serial Matrix Multiplication with Transpose Performance
Num N=100 (ms) N=300 (ms) N=700 (ms)
1 7.06 120.99 1529.96
2 6.58 121.08 1530.14
3 7.14 120.97 1529.81
4 6.33 121.09 1530.13
5 8.12 121.02 1531.94
Avg 7.07 121.03 1530.40
2.3 Parallel Matrix Multiplication
The third experiment parallelizes the nested for-loop of the first experiment to speed up its execution. The outer loop is marked with the for directive so that its iterations are distributed over threads; a sketch is given after Table 11. After this optimization the execution time drops significantly compared to the original code. The results are shown in Table 10. The speed-up factor is calculated by dividing the non-optimized execution time by the parallel execution time.
Table 10: Parallel Matrix Multiplication Performance
Num N=100 (ms) N=300 (ms) N=700 (ms)
1 8.52 61.03 713.06
2 9.92 59.65 710.79
3 10.36 65.45 726.01
4 6.14 58.17 736.89
5 5.77 53.07 704.14
Avg 8.14 59.47 718.18
Based on this comparison, the speed-up factor reaches 2.58 when the code multiplies 700x700 matrices. The detailed speed-up values are given in Table 11.
Table 11: Speed-Up Factor of Standard Matrix Multiplication Using Parallel Computation
Matrix Size Non Optimize (ms) Optimize (ms) Speed-Up Factor (X)
100x100 9.68 8.14 1.19
300x300 131.25 59.47 2.21
700x700 1856.56 718.18 2.58
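A minimal sketch of the parallelized kernel, assuming the same triple-loop form as in Section 2.1 and compilation with an OpenMP flag such as gcc -fopenmp (the exact code is not shown in the report):

/* Parallel matrix multiplication sketch (assumed form): the outer loop is
   distributed across threads; i, j, k and sum are private to each thread
   because they are declared inside the parallel loop. */
void matmul_parallel(const double *A, const double *B, double *C, int N)
{
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i * N + k] * B[k * N + j];
            C[i * N + j] = sum;
        }
    }
}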
2.4 Parallel Matrix Multiplication with Transpose
The fourth experiment applies the parallel optimization to the transpose method. As in the third experiment, the outer for-loop is parallelized with the for directive. For the largest matrix tested (700x700), this method gives the best execution time of the four, beating the standard multiplication, the serial transpose method, and the parallelized standard multiplication. The results are listed in Table 12, and a sketch is given below.
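A minimal sketch combining the transpose and the parallel loop, again under the same assumptions as the earlier kernels:

/* Parallel transpose-method sketch (assumed form): B is transposed into BT,
   then the outer loop of the multiplication is distributed across threads. */
void matmul_transpose_parallel(const double *A, const double *B, double *C,
                               double *BT, int N)
{
    #pragma omp parallel for
    for (int i = 0; i < N; i++)          /* the transpose itself can also run in parallel */
        for (int j = 0; j < N; j++)
            BT[j * N + i] = B[i * N + j];

    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i * N + k] * BT[j * N + k];
            C[i * N + j] = sum;
        }
    }
}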
Table 12: Parallel Matrix Multiplication With Transpose Performance
Num N=100 (ms) N=300 (ms) N=700 (ms)
1 12.81 120.60 678.88
2 11.78 126.33 679.04
3 13.89 123.17 679.55
4 15.09 63.89 681.10
5 13.34 122.50 680.53
Avg 13.38 111.20 679.82
The speed-up factors of this method are shown in Table 13.
2.5 Overall Matrix Multiplication Performance
The overall performance of the matrix multiplication methods is shown in Table 14. Based on this evaluation, parallel matrix multiplication with the transpose method is the best method for large matrices: multiplying 700x700 matrices takes only 679.82 ms.
Table 13: Speed-Up Factor of Transpose Matrix Multiplication Using Parallel Computation
Matrix Size Non Optimize (ms) Optimize (ms) Speed-Up Factor (X)
100x100 7.07 13.38 0.53
300x300 121.03 111.20 1.09
700x700 1530.40 679.82 2.25
On the other hand, when multiplying a small matrix (100x100), the non-optimized transpose method gives the best performance of all the methods: it needs only 7.07 ms. The experiment therefore shows that parallel optimization gives a significant benefit when it is applied to large matrices or to for-loops with many iterations.
Table 14: Overall Matrix Multiplication Performance
Size MatMul (ms) MatMul Trans (ms) Parallel MatMul (ms) Parallel MatMul Trans (ms)
100 9.68 7.07 8.14 13.38
300 131.25 121.03 59.47 111.20
700 1856.56 1530.40 718.18 679.82