I am thinking about different solutions for one problem. Assume we have K sorted linked lists and we are merging them into one. All these lists together have N elements.
The well-known solution is to use a priority queue, popping and pushing the first elements of the lists, and I can understand why it takes O(N log K) time.
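For concreteness, here is a minimal sketch of that priority-queue approach in Python (the ListNode class and the name merge_k_heap are my own placeholders, not part of the original post):

```python
import heapq

class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

def merge_k_heap(lists):
    """Merge K sorted linked lists with a min-heap in O(N log K) time."""
    heap = []
    for i, node in enumerate(lists):
        if node:
            # The index i breaks ties so that nodes themselves are never compared.
            heapq.heappush(heap, (node.val, i, node))

    dummy = tail = ListNode(None)
    while heap:
        _, i, node = heapq.heappop(heap)   # smallest of the K current heads
        tail.next, tail = node, node
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))  # push its successor
    return dummy.next
```

Each pop/push costs O(log K) and every element is popped exactly once, which is where the O(N log K) bound comes from.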
But let's take a look at another approach. Suppose we have some MERGE_LISTS(LIST1, LIST2) procedure that merges two sorted lists in O(T1 + T2) time, where T1 and T2 stand for the sizes of LIST1 and LIST2.
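Such a procedure could look like the following sketch (reusing the hypothetical ListNode class from above); every node is visited at most once, which gives the O(T1 + T2) bound:

```python
def merge_lists(a, b):
    """Merge two sorted linked lists in O(T1 + T2) time."""
    dummy = tail = ListNode(None)
    while a and b:
        if a.val <= b.val:          # one comparison per emitted node
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b              # append whatever remains of the other list
    return dummy.next
```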
What we do now is pair up the lists and merge them pair-by-pair (if the number of lists is odd, the last list can, for example, be set aside during the first rounds). This means we have to make the following "tree" of merge operations, where N1, N2, N3, ... stand for the sizes of LIST1, LIST2, LIST3, ...:

O(N1 + N2) + O(N3 + N4) + O(N5 + N6) + ...
O(N1 + N2 + N3 + N4) + O(N5 + N6 + N7 + N8) + ...
...
O(N1 + N2 + N3 + N4 + ... + NK)
It looks obvious that there will be log(K) such rows, each of them doing O(N) work in total, so the whole MERGE(LIST1, LIST2, ..., LISTK) operation would actually take O(N log K) time.
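In code, the pairwise scheme is just a loop over merge rounds; here is a minimal sketch building on the hypothetical merge_lists above (the name merge_k_pairwise is my own):

```python
def merge_k_pairwise(lists):
    """Merge K sorted linked lists by repeatedly merging them in pairs (log K rounds)."""
    lists = [head for head in lists if head]
    if not lists:
        return None
    while len(lists) > 1:
        merged = []
        # Merge lists[0] with lists[1], lists[2] with lists[3], ...
        for i in range(0, len(lists) - 1, 2):
            merged.append(merge_lists(lists[i], lists[i + 1]))
        if len(lists) % 2 == 1:
            merged.append(lists[-1])   # the odd one out is carried into the next round
        lists = merged
    return lists[0]
```

Each round halves the number of lists while touching every element at most once, matching the O(N log K) argument above.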
My friend told me (two days ago) it would take O(K*N) time. So, the question is: did I f%ck up somewhere, or is he actually wrong about this? And if I am right, why can't this 'divide & conquer' approach be used instead of the priority queue approach?
Merge sort is often preferred for sorting a linked list. The slow random-access performance of a linked list makes some other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible.
Merge K sorted linked lists using a min heap.
A simple approach takes O(N log N) time, where N is the count of all elements. An efficient solution is to use a heap data structure; the time complexity of the heap-based solution is O(N log K).
If you have a small number of lists to merge, this pairwise scheme is likely to be faster than a priority queue method because you have extremely few operations per merge: basically just one compare and two pointer reassignments per item (to shift it into the new singly-linked list). As you've shown, it is O(N log K) (log K steps handling N items each).
But the best priority queue algorithms are, I believe, O(sqrt(log K)) or O(log log U) for insert and remove (where U is the number of possible different priorities), provided you can prioritize with a value instead of having to use a compare. So if you are merging items that can be given, e.g., integer priorities, and K is large, then you're better off with a priority queue.
From your description, it does sound like your process is indeed O(N log K). It also will work, so you can use it.
I personally would use the first version with a priority queue, since I suspect it will be faster. It's not faster in the coarse big-O sense, but I think if you actually work out the number of comparisons and stores taken by both, the second version will take several times more work.