Lab Manual
Subject: Data Structures and Algorithms (DSA)
Experiment Title: Implementation of Bucket Sort
Objective: To understand and implement the Bucket Sort algorithm using C++ programming
language.
Theory: Bucket Sort is a sorting algorithm that distributes elements of an array into several
buckets. Each bucket is then sorted individually, either using a different sorting algorithm or by
recursively applying Bucket Sort. This algorithm is particularly useful when the input is
uniformly distributed over a range.
Algorithm Steps:
1. Create an empty array of buckets.
2. Iterate through the input array and assign each element to the appropriate bucket based on
its value.
3. Sort individual buckets using a suitable sorting technique (e.g., insertion sort).
4. Concatenate all sorted buckets to form the final sorted array.
Advantages:
Efficient for uniformly distributed data.
Can achieve linear time complexity, O(n), in the best case.
Disadvantages:
Not suitable for non-uniformly distributed data.
Requires additional space for buckets.
Applications:
Sorting floating-point numbers in a specific range.
Used in graphics and distributed systems.
C++ Program Code:
#include <iostream>
#include <vector>
#include <algorithm> // for sort()
using namespace std;
// Function to perform bucket sort on values expected to lie in [0, 1)
void bucketSort(vector<float>& arr) {
    int n = arr.size();
    if (n <= 0) return;
    // Step 1: Create n empty buckets
    vector<vector<float>> buckets(n);
    // Step 2: Put array elements into different buckets
    for (int i = 0; i < n; i++) {
        // For 0 <= arr[i] < 1, truncation gives an index in the range [0, n-1]
        int bucketIndex = n * arr[i];
        // Clamp so that a value such as 1.0 or a slightly negative value
        // cannot index outside the buckets
        bucketIndex = min(max(bucketIndex, 0), n - 1);
        buckets[bucketIndex].push_back(arr[i]);
    }
    // Step 3: Sort individual buckets
    for (int i = 0; i < n; i++) {
        sort(buckets[i].begin(), buckets[i].end());
    }
    // Step 4: Concatenate all sorted buckets into arr
    int index = 0;
    for (int i = 0; i < n; i++) {
        for (float val : buckets[i]) {
            arr[index++] = val;
        }
    }
}
int main() {
    // Sample array of floating-point numbers in range [0, 1)
    vector<float> arr = {0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434};
    cout << "Original array: \n";
    for (float num : arr) {
        cout << num << " ";
    }
    cout << endl;
    // Perform bucket sort
    bucketSort(arr);
    cout << "\nSorted array: \n";
    for (float num : arr) {
        cout << num << " ";
    }
    cout << endl;
    return 0;
}
Input: Original array: 0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434
Output: Sorted array: 0.1234, 0.3434, 0.565, 0.656, 0.665, 0.897
Practical Observations:
1. The choice of the number of buckets significantly impacts performance.
2. Uniform distribution of input ensures the best-case time complexity.
3. Sorting within buckets can be optimized using Insertion Sort for small bucket sizes.
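Observation 3 can be illustrated with a minimal sketch that replaces std::sort with a hand-written insertion sort inside each bucket. The names insertionSort and bucketSortInsertion are hypothetical helpers introduced only for this illustration, and the sketch assumes every input value lies in [0, 1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts one bucket in place with insertion sort (cheap when the bucket is small).
void insertionSort(std::vector<float>& bucket) {
    for (std::size_t i = 1; i < bucket.size(); ++i) {
        float key = bucket[i];
        std::size_t j = i;
        // Shift larger elements one slot to the right to make room for key.
        while (j > 0 && bucket[j - 1] > key) {
            bucket[j] = bucket[j - 1];
            --j;
        }
        bucket[j] = key;
    }
}

// Bucket sort for values assumed to lie in [0, 1), using insertion sort per bucket.
void bucketSortInsertion(std::vector<float>& arr) {
    const std::size_t n = arr.size();
    if (n == 0) return;
    std::vector<std::vector<float>> buckets(n);
    for (float x : arr) {
        assert(x >= 0.0f && x < 1.0f);  // precondition assumed by this sketch
        std::size_t idx = static_cast<std::size_t>(n * x);
        if (idx >= n) idx = n - 1;      // guard against floating-point round-up
        buckets[idx].push_back(x);
    }
    // Sort each bucket and write it back in bucket order.
    std::size_t out = 0;
    for (auto& b : buckets) {
        insertionSort(b);
        for (float x : b) arr[out++] = x;
    }
}
```

Insertion sort does little work on buckets that hold only a few elements, which is the expected situation when the input is uniformly distributed. If many elements fall into one bucket, its quadratic cost dominates.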
Viva Questions:
1. What is the time complexity of Bucket Sort in the best case?
2. How does Bucket Sort handle duplicate elements?
3. What are the limitations of Bucket Sort?
4. Compare Bucket Sort with other sorting algorithms like Quick Sort and Merge Sort.
Conclusion: Bucket Sort is an efficient sorting algorithm for a specific range of inputs,
especially when the data is uniformly distributed. The implementation requires careful
consideration of bucket size and sorting methodology within buckets to achieve optimal
performance.
Step-by-Step Explanation of Bucket Sort Execution:
Given Array: [0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434]
Number of Buckets (n): 6 (same as array size)
Assumption: Elements lie in the range [0, 1), which guarantees valid bucket indices. A uniform distribution is needed only for good performance, not for correctness.
Step 1: Create 6 Empty Buckets
Buckets are initialized as empty vectors:
[0] → [], [1] → [], [2] → [], [3] → [], [4] → [], [5] → [].
Step 2: Distribute Elements into Buckets
For each element, compute the bucket index using bucketIndex = n * arr[i] (truncated to integer).
Element   Calculation (6 × Element)   Bucket Index   Action
0.897     6 × 0.897 = 5.382           5              Add to Bucket 5
0.565     6 × 0.565 = 3.39            3              Add to Bucket 3
0.656     6 × 0.656 = 3.936           3              Add to Bucket 3
0.1234    6 × 0.1234 = 0.7404         0              Add to Bucket 0
0.665     6 × 0.665 = 3.99            3              Add to Bucket 3
0.3434    6 × 0.3434 = 2.0604         2              Add to Bucket 2
Buckets After Distribution:
Bucket 0: [0.1234]
Bucket 1: []
Bucket 2: [0.3434]
Bucket 3: [0.565, 0.656, 0.665]
Bucket 4: []
Bucket 5: [0.897]
Note:
Truncation (not rounding) keeps indices within [0, n-1] for elements in [0, 1); rounding could map a value close to 1 to index n.
Empty buckets (1 and 4) will be skipped during concatenation.
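The index calculation in Step 2 can be reproduced with a short sketch. The helper name bucketIndices is an assumption made for this illustration; it returns the bucket index of each element using the same rule as the program (n * element, truncated) and assumes all values lie in [0, 1).

```cpp
#include <cassert>
#include <vector>

// Returns the bucket index chosen for each element, using index = trunc(n * x).
// Assumes every x lies in [0, 1); bucketIndices is a hypothetical helper name.
std::vector<int> bucketIndices(const std::vector<float>& arr) {
    const int n = static_cast<int>(arr.size());
    std::vector<int> indices;
    indices.reserve(arr.size());
    for (float x : arr) {
        assert(x >= 0.0f && x < 1.0f);               // precondition of this sketch
        indices.push_back(static_cast<int>(n * x));  // the cast truncates toward zero
    }
    return indices;
}
```

For the sample array, this yields the indices 5, 3, 3, 0, 3, 2, matching the distribution table.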
Step 3: Sort Individual Buckets
Each bucket is sorted using std::sort (ascending order):
Bucket   Elements Before Sorting   Elements After Sorting   Notes
0        [0.1234]                  [0.1234]                 Already sorted (single element).
2        [0.3434]                  [0.3434]                 Already sorted.
3        [0.565, 0.656, 0.665]     [0.565, 0.656, 0.665]    Already in ascending order.
5        [0.897]                   [0.897]                  Already sorted.
Key Observations:
The code sorts every bucket, even if it contains a single element or is already sorted.
In this example, no reordering was needed.
Step 4: Concatenate All Sorted Buckets
Merge buckets in ascending index order (0 to 5), skipping empty ones:
Order of Concatenation:
1. Bucket 0: 0.1234
2. Bucket 1: [] (skipped)
3. Bucket 2: 0.3434
4. Bucket 3: 0.565 → 0.656 → 0.665
5. Bucket 4: [] (skipped)
6. Bucket 5: 0.897
Final Sorted Array:
[0.1234, 0.3434, 0.565, 0.656, 0.665, 0.897]
Critical Analysis of the Algorithm
1. Assumption of Uniform Distribution:
o The algorithm works optimally if elements are uniformly distributed.
o If elements are clustered in a few buckets (e.g., Bucket 3 in this example), sorting those
buckets dominates the runtime.
2. Edge Cases:
o If an element equals 1.0, n * 1.0 = n, which is one past the last valid bucket index; accessing buckets[n] without a bounds check is undefined behavior.
o The input array here avoids this issue by having elements strictly less than 1.
3. Time Complexity:
o Best Case: O(n) (elements spread evenly across the buckets, so each bucket holds O(1) elements).
o Average Case: O(n + k), where k is the number of buckets; with k = n and uniformly distributed data this is O(n).
o Worst Case: all elements land in one bucket, which costs O(n log n) with std::sort (as in this program) or O(n²) with insertion sort.
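The edge case described above (an element equal to 1.0) can be handled by clamping the computed index. The following is a minimal sketch under stated assumptions: safeBucketIndex and its parameters lo and hi are hypothetical names introduced here, and the function also generalizes the index rule from [0, 1) to an arbitrary closed range [lo, hi].

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Maps x to a bucket index in [0, n-1] for values expected in [lo, hi].
// safeBucketIndex, lo, and hi are illustrative names, not part of the program above.
std::size_t safeBucketIndex(float x, float lo, float hi, std::size_t n) {
    assert(n > 0 && lo < hi);
    // Clamp first so out-of-range values cannot yield a negative or oversized index.
    float clamped = std::min(std::max(x, lo), hi);
    // Normalize to [0, 1], then scale to the number of buckets.
    float scaled = (clamped - lo) / (hi - lo);
    std::size_t idx = static_cast<std::size_t>(scaled * static_cast<float>(n));
    // x == hi gives idx == n, so fold it into the last bucket.
    return std::min(idx, n - 1);
}
```

With lo = 0 and hi = 1, an element equal to 1.0 is placed in bucket n - 1 instead of indexing past the end. Clamping keeps the result sorted because all out-of-range values land in the first or last bucket, where they are ordered by the per-bucket sort.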
Final Output
The algorithm successfully sorts the input array using bucket sort.
Sorted Array:
[0.1234, 0.3434, 0.565, 0.656, 0.665, 0.897]
Mathematical Derivation of Bucket Sort Complexity
Bucket sort’s time complexity depends on two key factors:
1. Distribution of input elements (uniform vs. skewed).
2. Sorting algorithm used for individual buckets (e.g., O(k log k) vs. O(k²)).
We analyze complexity under the following assumptions:
n = number of elements.
k_i = number of elements in the i-th bucket.
Sorting each bucket uses an O(k_i log k_i) algorithm (e.g., std::sort).
Step 1: Distribute Elements into Buckets
Time: O(n)
Each element is placed into a bucket using O(1) calculations.
Step 2: Sort Individual Buckets
Time: $\sum_{i=0}^{n-1} O(k_i \log k_i)$
The total time depends on the distribution of ki:
Case 1: Uniform Distribution (Best/Average Case)
Elements are uniformly distributed, so E[k_i] = 1 and the expected cost of sorting each bucket, E[k_i log k_i], is O(1).
The sum simplifies to n⋅O(1) = O(n).
Total Time: O(n)+O(n)=O(n).
Case 2: Skewed Distribution (Worst Case)
All elements fall into one bucket: ki=n.
Sorting this bucket takes O(n log n).
Total Time: O(n)+O(n log n)=O(n log n).
Step 3: Concatenate Buckets
Time: O(n)
Traverse all buckets and merge them.
Summary of Time Complexity
Scenario       Time Complexity   Condition
Best Case      O(n)              Uniform distribution, k_i ≈ 1.
Average Case   O(n)              Uniform distribution (common assumption).
Worst Case     O(n log n)        All elements in one bucket (with O(k log k) sorting).
Worst Case*    O(n²)             All elements in one bucket (with O(k²) sorting, e.g., insertion sort).
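The conditions in the summary table depend on how many elements each bucket receives. As a rough sketch, the hypothetical helper bucketSizes below counts the bucket occupancy for a given input, assuming values in [0, 1) and n buckets for n elements as in the program above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many elements land in each of n buckets, where n = arr.size().
// bucketSizes is an illustrative helper; values are assumed to lie in [0, 1).
std::vector<std::size_t> bucketSizes(const std::vector<float>& arr) {
    const std::size_t n = arr.size();
    std::vector<std::size_t> sizes(n, 0);
    for (float x : arr) {
        assert(x >= 0.0f && x < 1.0f);  // precondition of this sketch
        // Same index rule as the program, clamped to the last bucket.
        std::size_t idx = std::min(static_cast<std::size_t>(n * x), n - 1);
        ++sizes[idx];
    }
    return sizes;
}
```

For the sample array the sizes are 1, 0, 1, 3, 0, 1, so no bucket is large. For a skewed input such as {0.01, 0.02, 0.03, 0.04}, all four elements fall into bucket 0, which is the single-bucket situation in the worst-case rows.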
Key Observations
1. Uniform Distribution:
o Bucket sort achieves linear time O(n) due to balanced buckets.
o Example: Sorting floating-point numbers in [0,1).
2. Non-Uniform Distribution:
o Worst-case complexity depends on the sorting algorithm for buckets.
o With O(k log k) sorting (e.g., merge sort), it remains O(n log n).
3. Space Complexity:
o O(n) for storing n buckets.
Mathematical Proof for Average Case
Assume elements are uniformly distributed in [0,1).
For n buckets, the probability of an element falling into any bucket is p=1/n.
The number of elements in a bucket follows a binomial distribution: k_i ~ Binomial(n, 1/n), so E[k_i] = n ⋅ (1/n) = 1.
For large n, this approximates a Poisson distribution with λ=1.
The expected value E[k_i log k_i] for λ=1 is constant.
Thus, $\sum_{i=0}^{n-1} E[k_i \log k_i] = n \cdot O(1) = O(n)$.
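The claim that the expected per-bucket sorting cost is bounded by a constant can be checked directly from the binomial model, without the Poisson approximation. The derivation below is a sketch under the same assumptions (n buckets, p = 1/n, independent uniform elements) and uses the standard binomial mean and variance.

```latex
\begin{align*}
E[k_i] &= np = n \cdot \tfrac{1}{n} = 1, \\
\operatorname{Var}(k_i) &= np(1-p) = 1 - \tfrac{1}{n}, \\
E[k_i^2] &= \operatorname{Var}(k_i) + E[k_i]^2 = 2 - \tfrac{1}{n}, \\
E[k_i \log k_i] &\le E[k_i^2] < 2
  \quad \text{(since } k \log k \le k^2 \text{ for integers } k \ge 1
  \text{, with } 0 \log 0 := 0\text{)}, \\
\sum_{i=0}^{n-1} E[k_i \log k_i] &\le n\left(2 - \tfrac{1}{n}\right) = 2n - 1 = O(n).
\end{align*}
```

The same bound on E[k_i^2] shows that the expected total cost stays O(n) even when each bucket is sorted with an O(k²) method such as insertion sort.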