Digital Assignment 3 - Implementation Report

Course Code and Name: BCSE205L - Computer Architecture and Organization
Slot: E1
Register Number of the Student: 23BPS1012
Name of the Student: Sakshi Bansal
Title of the Paper: Implementing Parallel Computing Using OpenMP

References (book/journal paper/web):
1. https://www.geeksforgeeks.org/introduction-to-parallel-programming-with-openmp-in-cpp/
2. https://wgropp.cs.illinois.edu/bib/talks/tdata/2004/mpi-half-day-public.pdf
3. https://github.com/CisMine/Parallel-Computing-Cuda-C
4. https://en.wikipedia.org/wiki/Parallel_computing
5. https://www.indeed.com/career-advice/career-development/parallel-programming
6. https://en.wikipedia.org/wiki/OpenMP

Link for video: https://www.loom.com/share/b9856c6e12f548388ec255440535e897?sid=9936f6b2-c57d-49de-8124-3bb0e1a4e612
Implementing Parallelism with OpenMP: A Detailed Performance Evaluation
Abstract
In an era where the limits of sequential computing are becoming increasingly apparent, parallel computing provides a path to enhanced computational performance. This paper explores the use of OpenMP, a shared-memory parallel programming model, for implementing parallelism in C. A special focus is placed on the calculation of Pi using numerical integration as a representative example to illustrate OpenMP's ease of use and performance benefits. Key metrics such as speedup, scalability, resource utilization, and overhead are analyzed to evaluate the effectiveness of parallelism through OpenMP.
Problem Statement
Traditional computing models execute instructions sequentially, limiting the speed at which computations can be performed. As modern processors come equipped with multiple cores, it becomes essential to utilize these resources effectively. Parallel computing distributes tasks across multiple cores to improve performance, making it well suited to workloads involving large computations.

The problem addressed in this paper is the inefficient computation of Pi using numerical integration in a serial implementation. As the number of intervals increases for higher precision, the serial approach becomes time-consuming. This calls for a parallel solution using OpenMP to achieve faster and more efficient computation.
Implementation
We use the following mathematical formulation to approximate Pi:

\pi = \int_0^1 \frac{4}{1 + x^2} \, dx

This integral is evaluated using numerical integration by dividing the range [0, 1] into many small intervals. The algorithm in C is enhanced with OpenMP directives to parallelize the computation.
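Concretely, applying the midpoint rule over n subintervals of width h = 1/n gives the discrete sum that the code below evaluates, where x_i is the midpoint of the i-th subinterval:

\pi \approx h \sum_{i=0}^{n-1} \frac{4}{1 + x_i^2}, \qquad x_i = \left(i + \tfrac{1}{2}\right) h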
#include <stdio.h>
#include <omp.h>
#include <math.h>

int main() {
    int n = 100000000;           // Number of intervals
    double h = 1.0 / n;          // Width of each interval
    double sum = 0.0;
    double pi;
    double start_time, end_time;

    start_time = omp_get_wtime();

    #pragma omp parallel
    {
        double x;
        double local_sum = 0.0;  // Per-thread partial sum

        #pragma omp for
        for (int i = 0; i < n; i++) {
            x = (i + 0.5) * h;   // Midpoint of interval i
            local_sum += 4.0 / (1.0 + x * x);
        }

        #pragma omp critical
        sum += local_sum;        // One thread at a time adds to the shared sum
    }

    pi = h * sum;
    end_time = omp_get_wtime();

    printf("Pi approximation: %.16f\n", pi);
    printf("Exact Pi:         %.16f\n", M_PI);
    printf("Error:            %.16e\n", fabs(pi - M_PI));
    printf("Execution time:   %f seconds\n", end_time - start_time);
    return 0;
}
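Assuming the source is saved as pi.c and GCC is used, the program can be compiled with OpenMP support and run with a chosen thread count as follows (the filename is illustrative; other compilers use different flags):

gcc -fopenmp pi.c -o pi -lm
OMP_NUM_THREADS=8 ./pi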
Explanation of Key Directives:
● #pragma omp parallel creates a team of threads.
● #pragma omp for distributes loop iterations.
● #pragma omp critical ensures that only one thread
updates the shared variable at a time.
Each thread calculates a partial sum, which is combined
at the end to compute the final value of Pi.
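The same combination can be expressed more concisely with OpenMP's reduction clause, which gives each thread a private copy of sum and merges the copies automatically. A minimal sketch of this variant (not the version benchmarked below):

#include <stdio.h>
#include <omp.h>

int main() {
    int n = 100000000;
    double h = 1.0 / n;
    double sum = 0.0;

    // reduction(+:sum) creates a private sum per thread and adds the
    // copies together when the loop ends, replacing the explicit
    // local_sum / critical pattern shown above.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        double x = (i + 0.5) * h;
        sum += 4.0 / (1.0 + x * x);
    }

    printf("Pi approximation: %.16f\n", h * sum);
    return 0;
}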
Results
To evaluate performance, the Pi approximation was run with various thread counts on an 8-core system. The results below show the execution times and speedups achieved by the parallel implementation.
Execution Time and Speedup:
Number of Threads    Execution Time (seconds)    Speedup
1 (Serial)           1.874                       1.00
2                    0.962                       1.95
4                    0.489                       3.83
8                    0.258                       7.26
16                   0.253                       7.41
The near-linear speedup observed up to 8 threads demonstrates efficient use of the multi-core architecture. Beyond this point, performance gains plateau due to thread management overhead and the fixed number of physical cores.
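For context, speedup and parallel efficiency are conventionally defined as

S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}

where T_p is the execution time on p threads. From the table above, efficiency at 8 threads is 7.26 / 8 ≈ 0.91, while at 16 threads it falls to 7.41 / 16 ≈ 0.46. This quantifies the plateau: once the thread count exceeds the 8 physical cores, additional threads add scheduling overhead without adding compute capacity.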
Graphs/Plots/Tables
The following visualizations help illustrate the performance dynamics; the Python script below generates both plots:
1. Execution Time vs. Number of Threads: shows how execution time decreases as the number of threads increases.
2. Speedup vs. Number of Threads: shows the speedup trend relative to serial execution.
import matplotlib.pyplot as plt
import numpy as np

# Measured data from the table above
threads = np.array([1, 2, 4, 8, 16])
time = np.array([1.874, 0.962, 0.489, 0.258, 0.253])
speedup = np.array([1.0, 1.95, 3.83, 7.26, 7.41])

# Side-by-side panels: execution time (left) and speedup (right)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

ax1.plot(threads, time, marker='o', color='blue')
ax1.set_title('Execution Time vs Threads')
ax1.set_xlabel('Threads')
ax1.set_ylabel('Time (seconds)')
ax1.grid(True)

ax2.plot(threads, speedup, marker='o', color='green')
ax2.set_title('Speedup vs Threads')
ax2.set_xlabel('Threads')
ax2.set_ylabel('Speedup')
ax2.grid(True)

plt.tight_layout()
plt.show()
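Running this script (with numpy and matplotlib installed) reproduces both plots from the measured data; replacing plt.show() with plt.savefig('scaling.png') would write the figure to a file instead (the filename is illustrative).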
Conclusion
This paper demonstrates how parallel computing with OpenMP can dramatically enhance computational performance, even with minimal changes to a serial codebase. By leveraging multi-core CPUs effectively, the Pi calculation achieved a large reduction in execution time, with a speedup of 7.26x on 8 threads.
Key takeaways include:
● Optimal performance was observed when the thread count matched the number of physical cores.
● The overhead of synchronization and context switching limits further speedup.
● OpenMP is highly suitable for parallelizing tasks with independent computations, such as numerical integration.

As computation continues to shift towards multi-core architectures, mastering tools like OpenMP becomes essential for efficient software development in science, engineering, and data-heavy fields.