I've been reading "Algorithms, 4th Ed" by Sedgewick & Wayne, and along the way I've been implementing the algorithms discussed in JavaScript.
I recently took the mergesort examples provided in the book to compare top-down and bottom-up approaches... but I'm finding that bottom-up is running faster (I think). See my analysis on my blog: http://www.akawebdesign.com/2012/04/13/javascript-mergesort-top-down-vs-bottom-up/
I have not been able to find any discussion that says one method of mergesort should be faster than the other. Is my implementation (or analysis) flawed?
Note: my analysis measures the iterative loops of the algorithm, not strictly the array compares/moves. Perhaps this is flawed or irrelevant?
EDIT: My analysis didn't actually time the speed, so my statement about it running "faster" is a bit misleading. I am tracking the "iterations" through the recursive method (top-down) and the for loops (bottom-up) - and bottom-up appears to use fewer iterations.
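For concreteness, here is a stripped-down sketch of the kind of instrumentation I mean (this is not my actual blog code, and all the helper names are mine); it counts key comparisons in both variants rather than loop iterations:

```javascript
// Merge two sorted arrays, incrementing counter.cmp once per key comparison.
function mergeCount(a, b, counter) {
  const out = [];
  let i = 0, j = 0;
  while (i < a.length && j < b.length) {
    counter.cmp++;
    if (a[i] <= b[j]) out.push(a[i++]);
    else out.push(b[j++]);
  }
  return out.concat(a.slice(i), b.slice(j)); // append the leftover tail
}

// Top-down: split in half recursively, then merge.
function topDown(arr, counter) {
  if (arr.length <= 1) return arr;
  const mid = Math.floor(arr.length / 2);
  return mergeCount(topDown(arr.slice(0, mid), counter),
                    topDown(arr.slice(mid), counter), counter);
}

// Bottom-up: start from singleton runs and repeatedly merge adjacent pairs.
function bottomUp(arr, counter) {
  let runs = arr.map(x => [x]);
  while (runs.length > 1) {
    const next = [];
    for (let i = 0; i < runs.length; i += 2) {
      next.push(i + 1 < runs.length
        ? mergeCount(runs[i], runs[i + 1], counter)
        : runs[i]); // odd run out: carried to the next pass unmerged
    }
    runs = next;
  }
  return runs.length ? runs[0] : [];
}

const data = Array.from({ length: 1000 }, () => Math.random());
const td = { cmp: 0 }, bu = { cmp: 0 };
topDown(data.slice(), td);
bottomUp(data.slice(), bu);
console.log('top-down:', td.cmp, 'bottom-up:', bu.cmp);
```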
I have not been able to find any discussion that says one method of mergesort should be faster than the other.
Bottom-up and top-down merge sort, as well as other variants, were well studied during the 1990s. In a nutshell, if you measure the cost as the number of comparisons of individual keys, the best-case costs are the same (~ (n lg n)/2), the worst-case cost of top-down is lower than or equal to that of bottom-up (both ~ n lg n), and the average-case cost of top-down is lower than or equal to that of bottom-up (both ~ n lg n), where "lg n" is the binary logarithm. The differences stem from the linear terms. Of course, if n = 2^p, the two variants are exactly the same. This means that, comparison-wise, top-down is always at least as good as bottom-up. Furthermore, it has been proved that the "half-half" splitting strategy of top-down merge sort is optimal. The relevant research papers are by Flajolet, Golin, Panny, Prodinger, Chen, Hwang and Sedgewick.
Here is what I came up with in my book Design and Analysis of Purely Functional Programs (College Publications, UK), in Erlang:
%% Top-down merge sort on lists (not stable).
tms([X|T=[_|U]]) -> cutr([X], T, U);
tms(T)           -> T.

%% Split the list in two without computing its length: the third argument
%% is a pointer advancing twice as fast as the second.
cutr(S, [Y|T], [_,_|U]) -> cutr([Y|S], T, U);
cutr(S, T, _)           -> mrg(tms(S), tms(T)).

%% Merge two sorted lists.
mrg(    [],     T)             -> T;
mrg(     S,    [])             -> S;
mrg(S=[X|_], [Y|T]) when X > Y -> [Y|mrg(S, T)];
mrg( [X|S],     T)             -> [X|mrg(S, T)].
Note that this is not a stable sort. Also, in Erlang (and OCaml), you need to use aliases (ALIAS=...) in the patterns if you want to save memory. The trick here is to find the middle of the list without knowing its length. This is done by cutr/3 which handles two pointers to the input list: one is incremented by one and the other by two, so when the second reaches the end, the first one is in the middle. (I learnt this from a paper by Olivier Danvy.) This way, you don't need to keep track of the length and you don't duplicate the cells of the second half of the list, so you only need (1/2)n lg n extra space, versus n lg n. This is not well known.
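The same two-pointer trick is easy to sketch in JavaScript (the cons-cell shape and all names here are my own, not from the book): one pointer advances one node per step while the other advances two, so when the fast one runs off the end, the slow one sits at the start of the second half, and the first half has been accumulated in reverse, exactly as cutr/3 does.

```javascript
// Cons cells for a singly linked list; null is the empty list (my convention).
const cons = (head, tail) => ({ head, tail });
const fromArray = arr => arr.reduceRight((t, h) => cons(h, t), null);
const toArray = l => {
  const out = [];
  for (let n = l; n !== null; n = n.tail) out.push(n.head);
  return out;
};

// Split a list at its middle without knowing its length.
function splitMiddle(list) {
  let front = null;  // first half, accumulated in reverse (like S in cutr/3)
  let slow = list;
  let fast = list;
  while (fast !== null && fast.tail !== null) {
    front = cons(slow.head, front); // move one key to the (reversed) first half
    slow = slow.tail;               // advance by one
    fast = fast.tail.tail;          // advance by two
  }
  return [front, slow];
}

const [front, back] = splitMiddle(fromArray([1, 2, 3, 4, 5]));
console.log(toArray(front), toArray(back)); // → [2, 1] [3, 4, 5]
```

Note that because the first half comes out reversed, a merge sort built on this split is not stable, matching the caveat above.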
It is often claimed that the bottom-up variant is preferable for functional languages or linked lists (Knuth, Panny, Prodinger), but I don't think this is true.
I was puzzled like you by the lack of discussion on merge sorts, so I did my own research and wrote a large chapter about it. I am currently preparing a new edition with more material on merge sorts.
By the way, there are other variants: queue merge sort and on-line merge sort (I discuss the latter in my book).
[EDIT: As the measure for the cost is the number of comparisons, there is no difference between choosing an array versus a linked list. Of course, if you implement the top-down variant with linked lists, you have to be clever, as you don't necessarily know the number of keys, but you'll need to traverse at least half the keys, each time, and reallocate, in total, (1/2)n lg n cells (if you are clever). Bottom-up merge sort with linked lists actually requires more extra memory, n lg n + n cells. So, even with linked lists, the top-down variant is the best choice. As far as the length of the program goes, your mileage may vary, but in a functional language, top-down merge sort can be made shorter than bottom-up, if stability is not required. There are some papers that discuss implementation issues of merge sort, like in-place sorting (for which you need arrays) or stability. For instance, "A Meticulous Analysis of Mergesort Programs", by Katajainen and Larsson Traff (1997).]
I asked the same question on the Coursera class forums for the August 2012 edition of this course. Professor Kevin Wayne (of Princeton) replied that in many cases recursion is faster than iteration because caching improves performance.
So the short answer I got at the time was that top-down merge sort will be faster than bottom-up merge sort for caching reasons.
Please note that the class was taught in the Java programming language (not JavaScript).