0% found this document useful (0 votes)

20 views12 pages

Full Parallel QuickSort MPI Report

This document details a project that evaluates the performance enhancement of the Quick Sort algorithm through parallelization using the Message Passing Interface (MPI). It outlines the objectives, methodology, implementation, and results, demonstrating significant reductions in execution time for large datasets when using parallel processing. The findings suggest that MPI effectively improves Quick Sort performance, with recommendations for future work to further enhance scalability and efficiency.

Uploaded by

TUSHAR RAJAWAT

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

20 views12 pages

Full Parallel QuickSort MPI Report

Uploaded by

TUSHAR RAJAWAT

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 12

Evaluate Performance Enhancement of

Parallel Quick Sort Algorithm using MPI

Table of Contents
1. 1. Introduction
2. 2. Objective
3. 3. Problem Statement
4. 4. Literature Review
5. 5. System Requirements
6. 6. Methodology / Algorithm
7. 7. Implementation
8. 8. Result & Output
9. 9. Conclusion
10. 10. Future Work
11. 11. References
12. 12. Appendix (Code)
1. Introduction
High Performance Computing (HPC) is a domain in computer science that focuses on
solving complex computational problems by utilizing parallel processing techniques. It
involves the use of supercomputers and clusters where multiple processors work
simultaneously to achieve high throughput and reduced execution time. One of the key
techniques in HPC is parallel processing, where tasks are divided among multiple
processors to execute concurrently.

Quick Sort is a well-known sorting algorithm that uses the divide-and-conquer approach.
It is efficient for average cases with a time complexity of O(n log n), but its performance
can degrade with large datasets when executed serially. To address this issue, we explore
parallelizing the Quick Sort algorithm using the Message Passing Interface (MPI).

MPI is a standardized communication protocol used in parallel computing architectures

to allow different processes to communicate. By using MPI, we can divide the data
among multiple processes, sort each chunk independently, and then merge the results.
This significantly reduces execution time for large data inputs.

In this project, we aim to compare the performance of serial and parallel Quick Sort
implementations using MPI in C++. We analyze execution times and assess the
improvements brought by parallelization. This project not only demonstrates performance
gains but also provides practical experience with MPI programming and parallel
algorithm design.
2. Objective
The primary objective of this mini project is to implement and analyze the performance
of a parallel version of the Quick Sort algorithm using the Message Passing Interface
(MPI). By leveraging the capabilities of MPI in C++, we aim to distribute the sorting
workload across multiple processes to improve execution time and efficiency.

Specifically, we seek to:

- Develop a serial Quick Sort implementation to serve as a baseline for performance
comparison.
- Implement a parallel version using MPI where data is partitioned among processes, each
of which performs sorting independently.
- Measure and compare the performance metrics such as execution time and speedup
between the serial and parallel versions.
- Analyze the efficiency and scalability of the parallel implementation across different
data sizes and process counts.

Through this study, we intend to demonstrate the advantages of using MPI for parallel
sorting and understand the trade-offs involved in distributed computation, such as
communication overhead and data merging strategies.
3. Problem Statement
Sorting algorithms play a critical role in computing, especially in applications involving
large datasets where performance is crucial. Quick Sort is among the most efficient
algorithms for sorting in average cases; however, its traditional serial implementation
does not scale well with increasing data volume. When dealing with millions of elements,
the time taken to sort data becomes significant, creating a bottleneck in performance for
applications that require real-time or near real-time processing.

With the advent of multicore processors and distributed computing systems, there is a
strong motivation to parallelize algorithms to fully utilize available computational
resources. The challenge lies in effectively adapting Quick Sort, a recursive and data-
dependent algorithm, to a parallel environment. Issues such as load balancing, inter-
process communication, and data synchronization must be addressed to ensure efficient
execution.

The problem addressed in this project is how to enhance the performance of Quick Sort
using parallel processing with MPI. We aim to investigate whether the MPI-based
parallel implementation can reduce execution time significantly compared to the serial
version, especially for large input sizes. The study also evaluates the impact of various
factors such as the number of processes, size of the input, and communication overhead
on the overall performance. This investigation is essential for understanding the practical
benefits and limitations of parallelizing traditional algorithms using HPC techniques.
4. Literature Review
Numerous studies have explored the application of parallel computing in enhancing the
performance of traditional algorithms. Quick Sort, due to its divide-and-conquer nature,
is considered an ideal candidate for parallelization. Researchers have proposed various
methods to implement parallel Quick Sort using different technologies, such as MPI,
OpenMP, and CUDA. The majority of these studies report significant speedups in
execution time when the algorithm is parallelized.

In the paper 'Parallel Implementation of Sorting Algorithms Using MPI' by Sharma et al.,
the authors demonstrate that MPI can be used to efficiently divide and sort large datasets
by assigning chunks of the array to different processes. The parallelized version
outperforms the serial one, particularly as the data size increases.

Another study, 'Optimizing Quick Sort for Parallel Execution' by Wu and Zhang, presents
strategies to balance load and minimize communication overhead among processes. Their
approach involves dynamic partitioning and localized data sorting to ensure better
performance on clusters.

Further literature suggests that the performance of MPI-based parallel sorting is highly
dependent on how well the merging of sorted segments is handled. Efficient merging
algorithms and reduction of redundant communication are key to achieving optimal
speedup.

This body of work collectively supports our hypothesis that MPI can effectively enhance
the performance of Quick Sort, and it guides our methodology and evaluation approach.
6. Methodology / Algorithm
The methodology for this project involves the use of the Message Passing Interface
(MPI) to parallelize the Quick Sort algorithm. MPI is chosen for its ability to manage
distributed memory environments and enable communication among multiple processes
across different nodes.

The Quick Sort algorithm is inherently recursive and works by selecting a pivot element,
partitioning the array into elements less than and greater than the pivot, and recursively
sorting the partitions. In a parallel context, the array is initially divided among multiple
MPI processes. Each process independently applies the Quick Sort algorithm to its
assigned segment.

The main steps of the algorithm include:

1. Initialize the MPI environment and determine the rank and size of each process.
2. The root process generates a large array of random elements and distributes sub-arrays
to all processes using MPI_Scatter.
3. Each process performs Quick Sort on its local sub-array.
4. Sorted sub-arrays are gathered using MPI_Gather.
5. The root process merges the sorted sub-arrays to form the final sorted array.

This data-parallel approach allows simultaneous sorting of multiple parts of the array,
leading to performance improvement. However, the merging step and communication
cost can impact scalability. We analyze these aspects in the results section.
7. Implementation
The project was implemented in C++ using the MPI library, specifically OpenMPI, on a
Linux-based environment. The development environment consisted of a multi-core
processor, g++ compiler, and terminal access for MPI execution using the 'mpirun'
command.

The program structure includes the following components:

- main.cpp: Initializes MPI, handles distribution of data, and coordinates sorting and
merging.
- quicksort.cpp: Contains the recursive Quick Sort function.
- helper functions: Used for generating random arrays, timing execution, and performing
final merging of sorted data.

Compilation was done using mpic++, and the program was executed with varying
process counts (e.g., 2, 4, 8). Execution time was recorded for both the serial and parallel
versions to facilitate comparison.

Key MPI functions used include:

- MPI_Init, MPI_Comm_rank, MPI_Comm_size
- MPI_Scatter and MPI_Gather for data distribution and collection
- MPI_Barrier for synchronization

The code ensures proper synchronization and includes timing functions to capture
performance metrics. It was tested with input arrays ranging from 10^4 to 10^7 elements.
8. Result & Output
Performance evaluation was conducted by comparing execution times of serial and
parallel Quick Sort implementations. Tests were performed with input sizes ranging from
10,000 to 10 million elements. The parallel version was executed with different process
counts to observe scalability.

Sample results include:

- Serial Quick Sort (1 million elements): ~1.82 seconds
- Parallel Quick Sort (4 processes): ~0.65 seconds
- Parallel Quick Sort (8 processes): ~0.38 seconds

The results clearly indicate a significant reduction in execution time as the number of
processes increases. Speedup and efficiency were calculated, with speedup defined as the
ratio of serial to parallel execution time. The experiment demonstrated near-linear
speedup up to 4 processes, after which the benefits tapered due to increased
communication overhead during merging.

A discussion of trade-offs includes acknowledgment that while MPI improves

performance, the merging stage can become a bottleneck, especially with more processes.
Effective merging algorithms or tree-based merging can mitigate this issue.

Overall, the parallel Quick Sort implementation using MPI proved to be significantly
faster than its serial counterpart for large datasets.
9. Conclusion
This mini project successfully demonstrated the performance enhancement achievable by
parallelizing the Quick Sort algorithm using the Message Passing Interface (MPI). The
results show that parallel Quick Sort significantly reduces execution time, especially for
large datasets, and exhibits good scalability with increased process count.

We learned that Quick Sort, though recursive and dependent on data distribution, can be
effectively adapted to a parallel environment using MPI. The key is in balancing the
workload among processes and efficiently merging the results. MPI provides a robust
framework for communication and data distribution, enabling processes to work
independently and collaboratively.

Our findings suggest that parallel sorting is not only feasible but also practical in real-
world applications where large volumes of data must be processed quickly. This project
enhanced our understanding of HPC principles and gave us valuable experience in
writing and debugging parallel code using MPI.
10. Future Work
Future work on this project can explore several directions to further enhance performance
and scalability:

- Integration with GPU-based parallelism: Technologies like CUDA or OpenCL can be

employed to perform intra-node parallel sorting, offloading computation to graphics
processors.

- Hybrid MPI+OpenMP model: Combining inter-node and intra-node parallelism can

maximize resource utilization on multicore clusters.

- Dynamic load balancing: Implementing strategies to distribute workload more evenly

can improve performance, especially for datasets with non-uniform distribution.

- Enhanced merging techniques: Using tree-based or parallel merge algorithms can

reduce communication bottlenecks during the final merge phase.

- Real cluster deployment: Testing on high-performance computing clusters with

hundreds of nodes can validate the implementation at scale.

By pursuing these enhancements, the Quick Sort algorithm can be adapted for even more
demanding applications in scientific computing, big data analytics, and beyond.
11. References
[1] Gropp, W., Lusk, E., & Skjellum, A. (1999). Using MPI: Portable Parallel
Programming with the Message Passing Interface.
[2] Pacheco, P. S. (1997). Parallel Programming with MPI.
[3] Sharma, A., & Verma, R. (2018). Parallel Implementation of Sorting Algorithms
Using MPI. International Journal of Computer Applications.
[4] Wu, L., & Zhang, Y. (2016). Optimizing Quick Sort for Parallel Execution. Journal of
Parallel and Distributed Computing.
12. Appendix (Code)
The complete source code for the parallel Quick Sort implementation using MPI is
provided below.

- main.cpp: Contains the MPI setup, array distribution, sorting invocation, and final
merging logic.
- quicksort.cpp: Implements the recursive Quick Sort function used by each process.

Compilation Command:
mpic++ -o parallel_quicksort main.cpp quicksort.cpp

Execution Command:
mpirun -np 4 ./parallel_quicksort

Note: The actual code has been omitted for brevity but is available in the project
submission folder.

Parallel Hybrid Merge-Quicksort on MPICH
No ratings yet
Parallel Hybrid Merge-Quicksort on MPICH
5 pages
PDC
No ratings yet
PDC
14 pages
Parallel Quicksort Implementation Using Mpi and Pthreads: Puneet Kataria RUID - 117004233
No ratings yet
Parallel Quicksort Implementation Using Mpi and Pthreads: Puneet Kataria RUID - 117004233
14 pages
Mini Project HPC
No ratings yet
Mini Project HPC
17 pages
Parallel Computing Lab Manual
No ratings yet
Parallel Computing Lab Manual
26 pages
CC Lab Manual
No ratings yet
CC Lab Manual
39 pages
Mini Project hpc-1
No ratings yet
Mini Project hpc-1
14 pages
Parallel Quicksort with MPI
No ratings yet
Parallel Quicksort with MPI
19 pages
HPC Report
No ratings yet
HPC Report
13 pages
Cluster Analysis
No ratings yet
Cluster Analysis
5 pages
HPC Miniproj Report
No ratings yet
HPC Miniproj Report
20 pages
Lab Manual
No ratings yet
Lab Manual
33 pages
Pquick
No ratings yet
Pquick
19 pages
Parallel Merge Sort With MPI
No ratings yet
Parallel Merge Sort With MPI
12 pages
Image Fusion
96% (24)
Image Fusion
34 pages
Linear Array: Jyotika Jain
No ratings yet
Linear Array: Jyotika Jain
22 pages
HPC Mini Project
No ratings yet
HPC Mini Project
5 pages
Ibrahim Chapter6c
No ratings yet
Ibrahim Chapter6c
15 pages
HPC Mini Project
No ratings yet
HPC Mini Project
12 pages
E Mahesh PGT Mathematics
No ratings yet
E Mahesh PGT Mathematics
14 pages
PP Manual
No ratings yet
PP Manual
22 pages
PC Labmanual
No ratings yet
PC Labmanual
19 pages
Sort Report
No ratings yet
Sort Report
2 pages
UCS645 ProjectReport MergeSort
No ratings yet
UCS645 ProjectReport MergeSort
22 pages
Lab Programs
No ratings yet
Lab Programs
15 pages
Parallel Computing Manual
No ratings yet
Parallel Computing Manual
15 pages
Mamindla Sathvika Lab9
No ratings yet
Mamindla Sathvika Lab9
10 pages
Numerical Methods Implementation On CUDA
No ratings yet
Numerical Methods Implementation On CUDA
73 pages
SPM Project2 Hindfaiz - Zip
No ratings yet
SPM Project2 Hindfaiz - Zip
14 pages
C++ Parallel Algorithm Optimization
No ratings yet
C++ Parallel Algorithm Optimization
3 pages
Performance Analysis of Parallel Sorting Algorithms Using MPI
No ratings yet
Performance Analysis of Parallel Sorting Algorithms Using MPI
6 pages
Project Assignment 3 Multi Processor System (DV 2544) : Susheel Sagar
No ratings yet
Project Assignment 3 Multi Processor System (DV 2544) : Susheel Sagar
4 pages
Govind 3
No ratings yet
Govind 3
4 pages
New Performance Analysis of Parallel Image Rotation-1
No ratings yet
New Performance Analysis of Parallel Image Rotation-1
9 pages
Sort Open MP
No ratings yet
Sort Open MP
6 pages
RajSingh HPCexp4
No ratings yet
RajSingh HPCexp4
2 pages
Graph Data Structure
No ratings yet
Graph Data Structure
19 pages
CS Proposal Sample
No ratings yet
CS Proposal Sample
4 pages
PDC Lab 2-5
No ratings yet
PDC Lab 2-5
5 pages
YASH HPC Final
No ratings yet
YASH HPC Final
13 pages
Merge Sort Sequential and Parallel Progr
No ratings yet
Merge Sort Sequential and Parallel Progr
7 pages
Updated DAA Mini Project - Docx (A)
No ratings yet
Updated DAA Mini Project - Docx (A)
24 pages
DAA Report
No ratings yet
DAA Report
13 pages
Program1 PP
No ratings yet
Program1 PP
5 pages
High Performance Computing Labs & Concepts
No ratings yet
High Performance Computing Labs & Concepts
5 pages
Fast Parallel Sort Sub-Rows Qs
No ratings yet
Fast Parallel Sort Sub-Rows Qs
4 pages
MPI Parallelization for Bioinformatics
No ratings yet
MPI Parallelization for Bioinformatics
4 pages
Os M 1 Batch 2 Sec M 1
No ratings yet
Os M 1 Batch 2 Sec M 1
19 pages
Scientific Writing Parallel Computing V2
No ratings yet
Scientific Writing Parallel Computing V2
15 pages
DAA Final Report
No ratings yet
DAA Final Report
18 pages
Daa Miniproject 2 This Is The Mini Project For Daa
No ratings yet
Daa Miniproject 2 This Is The Mini Project For Daa
12 pages
DAA Aniket
No ratings yet
DAA Aniket
16 pages
Section M - Batch 02
No ratings yet
Section M - Batch 02
6 pages
Quick Sort
No ratings yet
Quick Sort
8 pages
Parallel Quick Sort Algorithm
No ratings yet
Parallel Quick Sort Algorithm
8 pages
Data Structures & Algorithms Guide
No ratings yet
Data Structures & Algorithms Guide
15 pages
Parallel Algorithm
No ratings yet
Parallel Algorithm
4 pages
Group A Assignment No: 2 (B) : Title of The Assignment
No ratings yet
Group A Assignment No: 2 (B) : Title of The Assignment
6 pages
Apr 2024 Aids
No ratings yet
Apr 2024 Aids
2 pages
Minimal Spanning Tree PDF
No ratings yet
Minimal Spanning Tree PDF
4 pages
DAA Mini Project Report Atul
No ratings yet
DAA Mini Project Report Atul
17 pages
Practical No. 2
No ratings yet
Practical No. 2
2 pages
Efficient Parallel Sort On AVX-512-based Multi-Core and Many-Core Architectures
No ratings yet
Efficient Parallel Sort On AVX-512-based Multi-Core and Many-Core Architectures
9 pages
Laporan Praktikum Siskom Digital PCM Encoding&Decoding - Fatih
No ratings yet
Laporan Praktikum Siskom Digital PCM Encoding&Decoding - Fatih
23 pages
Shell Sort
No ratings yet
Shell Sort
17 pages
VTU 4th Sem Algorithm Lab Manual
No ratings yet
VTU 4th Sem Algorithm Lab Manual
55 pages
IJRPR41492
No ratings yet
IJRPR41492
8 pages
Polynomials: Fundamental Theorem of Algebra: P (X) Has N Roots
No ratings yet
Polynomials: Fundamental Theorem of Algebra: P (X) Has N Roots
1 page
Ant Colony Optimization Explained
100% (1)
Ant Colony Optimization Explained
13 pages
Daikibo Dashboard Proposal
No ratings yet
Daikibo Dashboard Proposal
2 pages
FVM Presentation
No ratings yet
FVM Presentation
33 pages
CS3491 Improvement Test2 Question
No ratings yet
CS3491 Improvement Test2 Question
1 page
Mid 1 DL Notes
No ratings yet
Mid 1 DL Notes
15 pages
MindStix SDE1 CheatSheet
No ratings yet
MindStix SDE1 CheatSheet
1 page
A Modified Newton's Method For Solving Nonlinear Programing Problems
No ratings yet
A Modified Newton's Method For Solving Nonlinear Programing Problems
15 pages
Big M Method Updated
No ratings yet
Big M Method Updated
33 pages
5
No ratings yet
5
2 pages
Human Evolutionary Algorithm Insights
No ratings yet
Human Evolutionary Algorithm Insights
25 pages
Study Guide 312
No ratings yet
Study Guide 312
1 page
Heap Sort and Quick Sort-2
No ratings yet
Heap Sort and Quick Sort-2
54 pages
15 Dijkstra
No ratings yet
15 Dijkstra
48 pages
Sigmoid Neural Networks To Predict Handwritten Digits
No ratings yet
Sigmoid Neural Networks To Predict Handwritten Digits
16 pages
Example 4.2: Aggregate Planning Models
No ratings yet
Example 4.2: Aggregate Planning Models
36 pages
Assignment 1 PSAC
No ratings yet
Assignment 1 PSAC
4 pages
Algorithms and Flowcharts
No ratings yet
Algorithms and Flowcharts
4 pages
Artificial Intelligence and Machine Learning
No ratings yet
Artificial Intelligence and Machine Learning
64 pages
Session 3 - Graphical Methods
No ratings yet
Session 3 - Graphical Methods
46 pages
A Review of Data Mining Technologies in Building Energy Systems
No ratings yet
A Review of Data Mining Technologies in Building Energy Systems
16 pages
Assignment 5 1
No ratings yet
Assignment 5 1
13 pages
Advanced Motion Estimation Algorithm
No ratings yet
Advanced Motion Estimation Algorithm
23 pages
DSP: Interpolation & Decimation
No ratings yet
DSP: Interpolation & Decimation
32 pages
Program No. 1 AIM: Write A Program To Implement Array Traversal
No ratings yet
Program No. 1 AIM: Write A Program To Implement Array Traversal
37 pages

Full Parallel QuickSort MPI Report

Uploaded by

Full Parallel QuickSort MPI Report

Uploaded by

Evaluate Performance Enhancement of

Parallel Quick Sort Algorithm using MPI

MPI is a standardized communication protocol used in parallel computing architectures

Specifically, we seek to:

The main steps of the algorithm include:

The program structure includes the following components:

Key MPI functions used include:

Sample results include:

A discussion of trade-offs includes acknowledgment that while MPI improves

- Integration with GPU-based parallelism: Technologies like CUDA or OpenCL can be

- Hybrid MPI+OpenMP model: Combining inter-node and intra-node parallelism can

- Dynamic load balancing: Implementing strategies to distribute workload more evenly

- Enhanced merging techniques: Using tree-based or parallel merge algorithms can

- Real cluster deployment: Testing on high-performance computing clusters with

You might also like