mergesort: Use queues to efficiently merge left and right sub-arrays. #83
First of all, thanks for sharing this repo and for keeping it updated, you are awesome!
After tinkering with `mergeSort`, I noticed that its running time degrades sharply once the input passes a certain size threshold, just as you described in your blog post.

The main culprit seems to be the `Array.prototype.shift()` method, which is used when merging the `left` and `right` subarrays in `mergeSort.merge()`. A quick inspection of the `Array.prototype.shift()` specification reveals that this operation takes O(n) time, because every remaining element has to be moved down by one index. The `right.shift()` and `left.shift()` calls are nested within a while loop that runs through all elements of the subarrays. Therefore we don't have an O(n log n) algorithm but one that seems to be quadratic.

Instead of arrays, I propose using the working `LinkedList` data structure to implement a queue ADT, so that the same operation costs O(1) instead of O(n). In a performance test on an unsorted array of 1M elements, this approach dropped the execution time from 8m 39.435s to 0.802s (~650x). More detailed results are here.

Moral of the story: repeatedly calling `Array.prototype.shift()` on big data sets can really hurt performance.
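
To make the idea concrete, here is a minimal sketch of a queue-based merge. It is not the code from this pull request: `SketchQueue`, `mergeWithQueues`, and `mergeSortWithQueues` are hypothetical names, and the tiny linked-list queue below only stands in for the repository's `LinkedList`, whose API is not reproduced here.

```javascript
// Hypothetical minimal FIFO queue backed by a singly linked list.
// Illustration only; it does not mirror the repository's LinkedList API.
class SketchQueue {
  constructor(items = []) {
    this.head = null;
    this.tail = null;
    for (const item of items) this.enqueue(item);
  }

  enqueue(value) {
    const node = { value, next: null };
    if (this.tail) {
      this.tail.next = node;
    } else {
      this.head = node;
    }
    this.tail = node;
  }

  // O(1): only the head pointer moves, no element is re-indexed.
  dequeue() {
    if (!this.head) return undefined;
    const { value } = this.head;
    this.head = this.head.next;
    if (!this.head) this.tail = null;
    return value;
  }

  peek() {
    return this.head ? this.head.value : undefined;
  }

  isEmpty() {
    return this.head === null;
  }
}

// Merge two sorted arrays by draining two queues instead of calling shift().
function mergeWithQueues(left, right) {
  const leftQueue = new SketchQueue(left);
  const rightQueue = new SketchQueue(right);
  const merged = [];

  while (!leftQueue.isEmpty() && !rightQueue.isEmpty()) {
    // "<=" takes from the left side on ties, which keeps the sort stable.
    if (leftQueue.peek() <= rightQueue.peek()) {
      merged.push(leftQueue.dequeue());
    } else {
      merged.push(rightQueue.dequeue());
    }
  }

  // At most one of the queues still holds elements; append them in order.
  while (!leftQueue.isEmpty()) merged.push(leftQueue.dequeue());
  while (!rightQueue.isEmpty()) merged.push(rightQueue.dequeue());

  return merged;
}

function mergeSortWithQueues(array) {
  if (array.length <= 1) return array;

  const middle = Math.floor(array.length / 2);
  const left = mergeSortWithQueues(array.slice(0, middle));
  const right = mergeSortWithQueues(array.slice(middle));

  return mergeWithQueues(left, right);
}
```

Each merge step now does constant work per element, so a merge of n elements stays linear. Tracking two read indices into the original arrays would avoid `shift()` as well; the queue version is shown because it matches the approach proposed above.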