EDIT:
It turns out the slow version is actually an insertion sort O(n^2) rather than a merge sort O(n log n) explaining the performance issue. I thought I'd save future readers the pain of wading through the code to discover this answer.
ORIGINAL BEGINS HERE -------------------------------------------
I've written two versions of merge sort in haskell and I can't grok why the one is 1000 times faster than the other. In both cases we start by making item in the list to be sorted a list creating a list of list. Then we pair up lists and merge them until only one list remains. The problem seems to be that I'm calling "doMerge (x1:x2:xs) = doMerge $ merge x1 x2 : doMerge xs" in the slow version but calling doMerge (mergePairs xs) in the fast version. I'm surprised by the 1000x speed difference!
-- Better version: takes 0.34 seconds to sort a 100,000 integer list.
betMergeSort :: [Int] -> [Int]
betMergeSort list = doMerge $ map (\x -> [x]) list
where
doMerge :: [[Int]] -> [Int]
doMerge [] = []
doMerge [xs] = xs
doMerge xs = doMerge (mergePairs xs)
mergePairs :: [[Int]] -> [[Int]]
mergePairs (x1:x2:xs) = merge x1 x2 : mergePairs xs
mergePairs xs = xs
-- expects two sorted lists and returns one sorted list.
merge :: [Int] -> [Int] -> [Int]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = if x <= y
then x : merge xs (y:ys)
else y : merge (x:xs) ys
-- Slow version: takes 350 seconds to sort a 100,000 integer list.
slowMergeSort :: [Int] -> [Int]
slowMergeSort list = head $ doMerge $ map (\x -> [x]) list
where
doMerge :: [[Int]] -> [[Int]]
doMerge [] = []
doMerge (oneList:[]) = [oneList]
doMerge (x1:x2:xs) = doMerge $ merge x1 x2 : doMerge xs
-- expects two sorted list and returns one sorted list.
merge :: [Int] -> [Int] -> [Int]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = if x <= y then x : merge xs (y:ys) else y : merge (x:xs) ys
Looking at the profiler output, it is clear that the slow version is allocating way more memory. I'm not sure why though. Both versions seem similar in my head. Can someone explain why the allocation is so different?
slowMergeSort Profiling result:
Wed Aug 21 12:24 2013 Time and Allocation Profiling Report (Final)
mergeSort +RTS -sstderr -p -RTS s
total time = 12.02 secs (12017 ticks @ 1000 us, 1 processor)
total alloc = 17,222,571,672 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
slowMergeSort.merge Main 99.2 99.4
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 74 0 0.0 0.0 100.0 100.0
main Main 149 0 0.0 0.0 100.0 100.0
main.sortedL Main 165 1 0.0 0.0 99.3 99.5
slowMergeSort Main 167 1 0.0 0.0 99.3 99.5
slowMergeSort.\ Main 170 40000 0.0 0.0 0.0 0.0
slowMergeSort.doMerge Main 168 79999 0.0 0.0 99.2 99.5
slowMergeSort.merge Main 169 267588870 99.2 99.4 99.2 99.4
main.sortVersion Main 161 1 0.0 0.0 0.0 0.0
randomInts Main 151 1 0.0 0.0 0.7 0.5
force Main 155 1 0.0 0.0 0.0 0.0
force.go Main 156 40001 0.0 0.0 0.0 0.0
randomInts.result Main 152 1 0.7 0.5 0.7 0.5
libMergeSort Profiling
Wed Aug 21 12:23 2013 Time and Allocation Profiling Report (Final)
mergeSort +RTS -sstderr -p -RTS l
total time = 0.12 secs (124 ticks @ 1000 us, 1 processor)
total alloc = 139,965,768 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
randomInts.result Main 66.9 64.0
libMergeSort.merge Main 24.2 30.4
main Main 4.0 0.0
libMergeSort Main 2.4 3.2
libMergeSort.merge_pairs Main 1.6 2.3
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 74 0 0.0 0.0 100.0 100.0
main Main 149 0 4.0 0.0 100.0 100.0
main.sortedL Main 165 1 0.0 0.0 28.2 35.9
libMergeSort Main 167 1 2.4 3.2 28.2 35.9
libMergeSort.\ Main 171 40000 0.0 0.0 0.0 0.0
libMergeSort.libMergeSort' Main 168 17 0.0 0.0 25.8 32.7
libMergeSort.merge_pairs Main 169 40015 1.6 2.3 25.8 32.7
libMergeSort.merge Main 170 614711 24.2 30.4 24.2 30.4
main.sortVersion Main 161 1 0.0 0.0 0.0 0.0
randomInts Main 151 1 0.0 0.0 67.7 64.0
force Main 155 1 0.0 0.0 0.8 0.0
force.go Main 156 40001 0.8 0.0 0.8 0.0
randomInts.result Main 152 1 66.9 64.0 66.9 64.0
The second of these is O(n^2)
and should not be used as it is the wrong algorithm (shouldn't be called mergesort).
doMerge (x1:x2:xs) = doMerge $ merge x1 x2 : doMerge xs
completly sorts xs
which is only a constant shorter than the original list. Merging a very short list with a very long one is equivalent to insertion, so we see that this is really just insertion sort. The entire point of mergesort is divide and conquer, this is not divide and conquer. As the length of the lists grow greater the speed ratio will continue to get worse.
The first is a proper merge sort as it merges lists of about even length.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With