Merging k sorted linked lists - analysis

I am thinking about different solutions for one problem. Assume we have K sorted linked lists and we are merging them into one. All these lists together have N elements.

The well-known solution is to use a priority queue, popping and pushing the first elements of every list, and I can understand why it takes O(N log K) time.
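A minimal sketch of that priority-queue approach in Python (the `Node` class and function names are my own, not from the question):

```python
import heapq

class Node:
    """Minimal singly linked list node (illustrative)."""
    def __init__(self, val, nxt=None):
        self.val = val
        self.next = nxt

def merge_k_heap(lists):
    """Merge K sorted linked lists using a min-heap of their current heads."""
    heap = []
    for i, head in enumerate(lists):
        if head:
            # the list index i breaks ties so Node objects are never compared
            heapq.heappush(heap, (head.val, i, head))
    dummy = tail = Node(None)
    while heap:
        val, i, node = heapq.heappop(heap)  # O(log K) per popped element
        tail.next = node
        tail = node
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    return dummy.next
```

Each of the N elements passes through the heap of size at most K once, which is where the O(N log K) bound comes from.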

But let's take a look at another approach. Suppose we have some MERGE_LISTS(LIST1, LIST2) procedure that merges two sorted lists in O(T1 + T2) time, where T1 and T2 are the sizes of LIST1 and LIST2.
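Such a MERGE_LISTS procedure could look like this in Python (a sketch; the `Node` class and names are illustrative, not from the question):

```python
class Node:
    """Minimal singly linked list node (illustrative)."""
    def __init__(self, val, nxt=None):
        self.val = val
        self.next = nxt

def merge_lists(a, b):
    """Merge two sorted singly linked lists in O(T1 + T2) time."""
    dummy = tail = Node(None)
    while a and b:
        # one comparison and one pointer reassignment per emitted node
        if a.val <= b.val:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b  # splice in whichever list still has elements
    return dummy.next
```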

What we do now is pair up these lists and merge them pair by pair (if the number is odd, the last list can, for example, be carried over unmerged to the next round). This gives the following "tree" of merge operations:

N1, N2, N3... stand for LIST1, LIST2, LIST3 sizes

  • O(N1 + N2) + O(N3 + N4) + O(N5 + N6) + ...
  • O(N1 + N2 + N3 + N4) + O(N5 + N6 + N7 + N8) + ...
  • O(N1 + N2 + N3 + N4 + .... + NK)

It looks obvious that there will be log(K) of these rows, each of them doing O(N) work in total, so the time for the MERGE(LIST1, LIST2, ..., LISTK) operation would actually be O(N log K).
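The pairing scheme above can be sketched as follows (a hypothetical Python implementation; `Node`, `merge_lists`, and `merge_pairwise` are my own names):

```python
class Node:
    """Minimal singly linked list node (illustrative)."""
    def __init__(self, val, nxt=None):
        self.val = val
        self.next = nxt

def merge_lists(a, b):
    """Merge two sorted singly linked lists in O(T1 + T2) time."""
    dummy = tail = Node(None)
    while a and b:
        if a.val <= b.val:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b
    return dummy.next

def merge_pairwise(lists):
    """Merge adjacent pairs repeatedly: log K rounds, each touching all N nodes."""
    if not lists:
        return None
    while len(lists) > 1:
        merged = []
        for i in range(0, len(lists) - 1, 2):
            merged.append(merge_lists(lists[i], lists[i + 1]))
        if len(lists) % 2:
            merged.append(lists[-1])  # odd list out, carried to the next round
        lists = merged
    return lists[0]
```

Each pass through the `while` loop is one row of the tree: it halves the number of lists while visiting every node once, hence log K rounds of O(N) work.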

My friend told me (two days ago) it would take O(K·N) time. So the question is: did I slip up somewhere, or is he actually wrong about this? And if I am right, why can't this divide-and-conquer approach be used instead of the priority queue approach?

M. Williams asked Apr 24 '10


2 Answers

If you have a small number of lists to merge, this pairwise scheme is likely to be faster than a priority queue method because you have extremely few operations per merge: basically just one compare and two pointer reassignments per item (to shift into a new singly-linked list). As you've shown, it is O(N log K) (log K steps handling N items each).

But the best priority queue algorithms are, I believe, O(sqrt(log K)) or O(log log U) for insert and remove (where U is the number of possible different priorities)--if you can prioritize with a value instead of having to use a compare--so if you are merging items that can be given e.g. integer priorities, and K is large, then you're better off with a priority queue.

Rex Kerr answered Oct 03 '22


From your description, it does sound like your process is indeed O(N log K). It will also work, so you can use it.

I personally would use the first version with a priority queue, since I suspect it will be faster. It's not faster in the coarse big-O sense, but I think if you actually work out the number of comparisons and stores taken by both, the second version will take several times more work.

Sean Owen answered Oct 03 '22