I have m arrays, each of length n. Each array is sorted. I want to merge them into a single sorted array of length m*n that contains all the values of the original arrays (including repeated values).
I think the optimal time complexity is O(m*n*log(m)).
Here's a sketch of the algorithm.
I create a support array H of length m, containing the first element of each array.
I then sort this array (m log m) and move the min value to the output array.
I then replace the moved value with the next one from the array it was taken from. Actually I don't replace it in place; I insert it at the right (sorted) position. This takes log m, I think.
I repeat this for all m*n values, hence m*n*log m.
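The steps above can be sketched in Python roughly as follows. This is only an illustration of the idea as described; the function name merge_with_sorted_support and the (value, array index, position) tuples are assumptions made for the example. Note that bisect.insort finds the position with a binary search, but placing the item in a Python list still shifts elements, as does pop(0).

```python
import bisect

def merge_with_sorted_support(arrays):
    # H holds one (value, array_index, position) entry per non-empty array,
    # kept in sorted order at all times.
    H = sorted((a[0], i, 0) for i, a in enumerate(arrays) if a)
    out = []
    while H:
        # The smallest remaining value is always at the front of H.
        # pop(0) shifts the rest of the list, so it costs O(m), not O(1).
        value, i, j = H.pop(0)
        out.append(value)
        if j + 1 < len(arrays[i]):
            # Binary search locates the slot in O(log m), but the list
            # insertion itself may move up to m entries.
            bisect.insort(H, (arrays[i][j + 1], i, j + 1))
    return out

print(merge_with_sorted_support([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
```

The output is correct, but because of the shifting each step is O(m) in the worst case rather than O(log m).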
My question: can you think of a more efficient algorithm? If m*n*log m is actually optimal, can you at least think of a simpler, more elegant algorithm?
Time Complexity Merging elements of 'ARR1' and 'ARR2' in 'ARR3' takes O(M + N) time. Updating 'ARR1' elements with new values takes O(M) time. Updating 'ARR2' elements with new values takes O(N) time. So, the overall time complexity is O(M + N) + O(M) + O(N) = O(M + N).
The list of size N is split in half repeatedly, giving at most log N levels, and merging all sublists on one level takes O(N) time, so the worst-case run time of this algorithm is O(N log N). Best-case time complexity: O(N log N). Worst-case time complexity: O(N log N). Average time complexity: O(N log N). The time complexity of ...
The insertion and deletion of elements in a sorted array executes at O(n), due to the need to shift all the elements following the element to be inserted or deleted; in comparison a self-balancing binary search tree inserts and deletes at O(log n).
Use insertion sort for small subarrays. We can improve most recursive algorithms by handling small cases differently. Switching to insertion sort for small subarrays will improve the running time of a typical mergesort implementation by 10 to 15 percent.
The complexity is right! However, there's a small flaw in your algorithm idea: you cannot insert an item into a sorted array in log m. You can find its position using binary search in that complexity, but you might have to move elements around to actually place it there. To fix this problem, you can use a heap data structure instead!
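A minimal sketch of the heap-based variant, using Python's standard heapq module. The function name heap_merge and the tuple layout are assumptions made for this example, not something from the question.

```python
import heapq

def heap_merge(arrays):
    # One (value, array_index, position) entry per non-empty array.
    heap = [(a[0], i, 0) for i, a in enumerate(arrays) if a]
    heapq.heapify(heap)  # O(m)
    out = []
    while heap:
        value, i, j = heap[0]  # smallest remaining value
        out.append(value)
        if j + 1 < len(arrays[i]):
            # Swap the root for the next value from the same array: O(log m).
            heapq.heapreplace(heap, (arrays[i][j + 1], i, j + 1))
        else:
            # That array is exhausted, so the heap shrinks by one.
            heapq.heappop(heap)
    return out

print(heap_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
```

Every one of the m*n output values costs one O(log m) heap operation, which gives the O(m*n*log m) bound. In practice the standard library's heapq.merge(*arrays) already provides a lazy merge of sorted inputs.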
Multi-way merge (which is the common name of your algorithm) is usually implemented with yet another 'merging' data-structure: the tournament-tree. You can find a description in Knuth's "The Art of Computer Programming" (Chapter on Sorting, iirc). It has a lower constant factor in theory and in practice when compared to heaps in this specific case.
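To give a rough idea of the tournament-tree approach, here is a small sketch of a winner tree stored in a flat list. It is a simplified illustration under my own assumptions (the name tournament_merge, the array layout, and the use of a winner tree rather than the loser-tree variant are all choices made for this example), not the formulation from Knuth or from any particular library.

```python
def tournament_merge(arrays):
    k = len(arrays)
    if k == 0:
        return []
    pos = [0] * k  # next unread index in each input array

    def beats(a, b):
        # True if source a should win against source b.
        if pos[a] >= len(arrays[a]):
            return False  # a is exhausted and can never win
        if pos[b] >= len(arrays[b]):
            return True   # b is exhausted, a still has values
        return arrays[a][pos[a]] <= arrays[b][pos[b]]

    # tree[k + j] is the leaf for source j; tree[i] stores the index of the
    # winning source among the leaves below internal node i; tree[1] is the root.
    tree = [0] * (2 * k)
    for j in range(k):
        tree[k + j] = j
    for i in range(k - 1, 0, -1):
        left, right = tree[2 * i], tree[2 * i + 1]
        tree[i] = left if beats(left, right) else right

    out = []
    for _ in range(sum(len(a) for a in arrays)):
        w = tree[1]  # overall winner holds the smallest remaining value
        out.append(arrays[w][pos[w]])
        pos[w] += 1
        # Replay only the matches on the path from w's leaf up to the root.
        i = (k + w) // 2
        while i >= 1:
            left, right = tree[2 * i], tree[2 * i + 1]
            tree[i] = left if beats(left, right) else right
            i //= 2
    return out

print(tournament_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
```

After each output value, only one comparison per tree level is replayed, so the cost per value is about log m comparisons.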
If you want to look at implementations, I'm pretty sure that the parallel multi-way merge in the GNU C++ Standard Library parallel extensions is implemented this way.
Edit: I referenced the wrong book, which is fixed now.