Just got my feet wet in sorting algorithm with Haskell. I've implemented insertion-sort and merge-sort <pre class="prettyprint"><code>insert_sort :: (Ord a, Show a) => [a] -> [a] insert_sort keys = foldr f [] keys where f key [] = [key] f key acc = insert key acc insert y [] = [y] insert y (x:xs) | x < y = x : insert y xs | otherwise = y : x : xs merge_sort :: (Ord a, Show a) => [a] -> [a] merge_sort (x:[]) = [x] merge_sort keys = merge (merge_sort (take len keys)) (merge_sort (drop len keys)) where len = length keys `div` 2 merge :: [a] -> [a] -> [a] merge (x:xs) [] = (x:xs) merge [] (y:ys) = (y:ys) merge (x:xs) (y:ys) = if x <= y then x : merge (xs) (y:ys) else y : merge (x:xs) ys </code></pre> Here's how I compared their efficiency: <pre class="prettyprint"><code>insert_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int] merge_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int] </code></pre> Both of them starts to print out results after a short delay but merge-sort prints much faster. As we know, merge-sort is much faster than insertion-sort for large data sets. I thought that would be shown by how they give results (like a long delay versus a short one) not how they print results. Is it because I use <code>foldr</code> in insertion-sort? What's behind the scene? EDIT: Thx guys. I've heard about lazy evaluation since I started to learn Haskell but yet got the hang of it. Would anybody illustrate a bit more with a small data set, say [5,2,6,3,1,4]? How is it possible to output results before finish sorting with <code>foldr</code> since the first elements comes at last?

OK here's the break down. You want me to print out: <pre class="prettyprint"><code>merge_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int] </code></pre> I happen to know that this is a list. So First I'll print out an open brace <pre class="prettyprint"><code>[ </code></pre> Then I'll look for the first element of the list, print that out, and then a comma. That means I have to start evaluating that expression until I can figure out what the first element of the list is. <pre class="prettyprint"><code>merge_sort THUNK0 </code></pre> Well now I need to pattern match. Either THUNK matches <code>(x:[])</code> or it doesn't. But I don't know yet. So I'll evaluate that thunk a bit. I make that thunk produce the first two random numbers (out of 100000). Now I know that it doesn't match the first definition, so I take the second definition of <code>merge_sort</code>. <pre class="prettyprint"><code>merge_sort keys = merge THUNK1 THUNK2 -- keys = THUNK0 </code></pre> Well that's easy enough...it's just a call to merge. I'll expand that definition. Oh crap, there are three different patterns this might match. I guess I should evaluate THUNK1 a little and see if it matches the first definition's pattern, <code>(x:xs)</code> <pre class="prettyprint"><code>merge_sort (take THUNK3 THUNK0) </code></pre> Back to <code>merge_sort</code> again, are we? That means I need to evaluate <code>(take THUNK3 THUNK0)</code> just enough to tell if it matches <code>(x:[])</code> or not. Oh CRAP. <code>take</code> is strict in its first argument...that means I have to fully evaluate THUNK3. Ok...deep breaths... <pre class="prettyprint"><code>len = length THUNK0 `div` 2 </code></pre> Now here's an irritating case. To calculate <code>length</code> on THUNK0 (which is a list), I have to expand the WHOLE SPINE. I don't have to actually calculate the values inside, but I do need to flesh out the structure of the entire list. This, of course, is done one pattern match at a time, determining whether it is <code>[]</code>, or <code>(x:xs)</code>. But as a whole, <code>length</code> is "spine strict". short pause whilst I flesh out the spine of a 100000-element list Phew, got that done. Now I know the length, which means I know <code>len = 500000</code>. THUNK0 is finally fully evaluated! Phew! Where was I? <pre class="prettyprint"><code>merge_sort (take 500000 THUNK3) </code></pre> <hr> And so forth. <code>merge_sort</code> will continue trying to be as lazy as possible. Recursive calls to <code>merge_sort</code> will be as lazy as possible. Eventually, to determine the very first element of the outermost <code>merge_sort</code>, we will need to know the very first element of both recursive calls to <code>merge_sort</code>. And to know the first element of those...we'll need the first element of subsequent recursive calls, etc. So there will be about O(n) work done, because every element needs to be evaluated (performing the random number generation for each one). Then, think of it like a tournament. Each element is paired up against another element. The "winning" (lowest) elements move on to the next round (becoming the first element of the recursive call to the lowest <code>merge_sort</code>s). There is another competition with 1/2 as many combatants, and 1/2 of those (1/4 of the total) move on to the next round, etc. This also turns out to be O(n) work, since the (n/2) comparisons are performed during the first round, and subsequent rounds grow smaller much too quickly to be significant. (The sum 1/2 + 1/4 + 1/8 ... converges at 1, meaning a total of n comparisons are performed.) All in all, O(n) work needs to be performed in order to finally produce the first element. Additional work needs to be performed for subsequent elements, but the total amount of work ends up being O(n log(n)). <hr> Now contrast this with <code>insert_sort</code>. It too produces the first element in O(n) time, but the total amount of work is O(n2) in the worst case: it "inserts" the list's elements into the sorted list one by one, and traverses this sorted list potentially to its very end each time.

the way merge-sort faster than insertion-sort puzzles me

Tags:

algorithm

sorting

haskell

lazy-evaluation

Just got my feet wet in sorting algorithm with Haskell. I've implemented insertion-sort and merge-sort

insert_sort :: (Ord a, Show a) => [a] -> [a]
insert_sort keys = foldr f [] keys
           where f key []        = [key]
                 f key acc       = insert key acc
                 insert y []     = [y]
                 insert y (x:xs)
                     | x < y     = x : insert y xs
                     | otherwise = y : x : xs

merge_sort :: (Ord a, Show a) => [a] -> [a]
merge_sort (x:[]) = [x]
merge_sort keys   = merge  (merge_sort (take len keys)) (merge_sort (drop len keys))
      where len         = length keys `div` 2
            merge :: [a] -> [a] -> [a]
            merge (x:xs) []     = (x:xs)
            merge []     (y:ys) = (y:ys)
            merge (x:xs) (y:ys) = if x <= y
                                  then x : merge (xs) (y:ys)
                                  else y : merge (x:xs) ys

Here's how I compared their efficiency:

insert_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]
merge_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]

Both of them starts to print out results after a short delay but merge-sort prints much faster. As we know, merge-sort is much faster than insertion-sort for large data sets. I thought that would be shown by how they give results (like a long delay versus a short one) not how they print results. Is it because I use foldr in insertion-sort? What's behind the scene?

EDIT: Thx guys. I've heard about lazy evaluation since I started to learn Haskell but yet got the hang of it. Would anybody illustrate a bit more with a small data set, say [5,2,6,3,1,4]? How is it possible to output results before finish sorting with foldr since the first elements comes at last?

389

asked Jan 19 '12 01:01

manuzhang

2 Answers

Behind the scene is lazy evaluation. The start of the sorted lists is determined before the sort is complete, so it can be output before the work is finished. Since a mergesort is faster, the merge-sorted list is printed out faster.

As requested: how sorting [5,2,6,3,1,4] proceeds. I use insert_sort = foldr ins [] for brevity.

insert_sort [5,2,6,3,1,4]
  = foldr ins [] [5,2,6,3,1,4]
  = 5 `ins` foldr ins [] [2,6,3,1,4]
  = 5 `ins` 2 `ins` [6,3,1,4] ...
  = 5 `ins` 2 `ins` 6 `ins` 3 `ins` 1 `ins` 4 `ins` []
  = 5 `ins` 2 `ins` 6 `ins` 3 `ins` 1 `ins` (4:[])
  = 5 `ins` 2 `ins` 6 `ins` 3 `ins` (1:4:[])
  = 5 `ins` 2 `ins` 6 `ins` (1 : (3 `ins` (4:[])))
  = 5 `ins` 2 `ins` (1 : (6 `ins` (3 `ins` (4:[]))))
  = 5 `ins` (1 : (2 `ins` (6 `ins` (3 `ins` (4:[])))))
  = 1 : (5 `ins` (2 `ins` (6 `ins` (3 `ins` (4:[])))))  -- now 1 can be output
  = 1 : (5 `ins` (2 `ins` (6 `ins` (3:4:[]))))
  = 1 : (5 `ins` (2 `ins` (3 : (6 `ins` (4:[])))))
  = 1 : (5 `ins` (2 : (3 : (6 `ins` (4:[])))))
  = 1 : 2 : (5 `ins` (3 : (6 `ins` (4:[]))))            -- now 2 can be output
  = 1 : 2 : 3 : (5 `ins` (6 `ins` (4:[])))              -- now 3
  = 1 : 2 : 3 : (5 `ins` (4:6:[]))
  = 1 : 2 : 3 : 4 : (5 `ins` (6:[]))                    -- now 4
  = 1 : 2 : 3 : 4 : 5 : 6 : []                          -- done

And merge sort (abbreviations: merge = mg, merge_sort = ms):

merge_sort [5,2,6,3,1,4]
  = mg (ms [5,2,6]) (ms [3,1,4])
  = mg (mg (ms [5]) (ms [2,6])) (mg (ms [3]) (ms [1,4]))
  = mg (mg [5] (mg [2] [6])) (mg [3] (mg [1] [4]))
  = mg (mg [5] [2,6]) (mg [3] [1,4])
  = mg (2 : mg [5] [6]) (1 : mg [3] [4])
  = 1 : mg (2 : mg [5] [6]) (mg [3] [4])                -- now 1 can be output
  = 1 : mg (2 : mg [5] [6]) [3,4]
  = 1 : 2 : mg (mg [5] [6]) [3,4]                       -- now 2 can be output
  = 1 : 2 : mg [5,6] [3,4]
  = 1 : 2 : 3 : mg [5,6] [4]                            -- now 3
  = 1 : 2 : 3 : 4 : mg [5,6] []                         -- now 4
  = 1 : 2 : 3 : 4 : 5 : 6 : []                          -- now 5 and 6

Admittedly I've taken a few short cuts, but Haskell isn't the only lazy one.

145

answered Oct 22 '22 06:10

Daniel Fischer

OK here's the break down. You want me to print out:

merge_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]

I happen to know that this is a list. So First I'll print out an open brace

Then I'll look for the first element of the list, print that out, and then a comma. That means I have to start evaluating that expression until I can figure out what the first element of the list is.

merge_sort THUNK0

Well now I need to pattern match. Either THUNK matches (x:[]) or it doesn't. But I don't know yet. So I'll evaluate that thunk a bit. I make that thunk produce the first two random numbers (out of 100000). Now I know that it doesn't match the first definition, so I take the second definition of merge_sort.

merge_sort keys = merge THUNK1 THUNK2 -- keys = THUNK0

Well that's easy enough...it's just a call to merge. I'll expand that definition. Oh crap, there are three different patterns this might match. I guess I should evaluate THUNK1 a little and see if it matches the first definition's pattern, (x:xs)

merge_sort (take THUNK3 THUNK0)

Back to merge_sort again, are we? That means I need to evaluate (take THUNK3 THUNK0) just enough to tell if it matches (x:[]) or not. Oh CRAP. take is strict in its first argument...that means I have to fully evaluate THUNK3. Ok...deep breaths...

len = length THUNK0 `div` 2

Now here's an irritating case. To calculate length on THUNK0 (which is a list), I have to expand the WHOLE SPINE. I don't have to actually calculate the values inside, but I do need to flesh out the structure of the entire list. This, of course, is done one pattern match at a time, determining whether it is [], or (x:xs). But as a whole, length is "spine strict".

short pause whilst I flesh out the spine of a 100000-element list

Phew, got that done. Now I know the length, which means I know len = 500000. THUNK0 is finally fully evaluated! Phew! Where was I?

merge_sort (take 500000 THUNK3)

And so forth. merge_sort will continue trying to be as lazy as possible. Recursive calls to merge_sort will be as lazy as possible. Eventually, to determine the very first element of the outermost merge_sort, we will need to know the very first element of both recursive calls to merge_sort. And to know the first element of those...we'll need the first element of subsequent recursive calls, etc. So there will be about O(n) work done, because every element needs to be evaluated (performing the random number generation for each one).

Then, think of it like a tournament. Each element is paired up against another element. The "winning" (lowest) elements move on to the next round (becoming the first element of the recursive call to the lowest merge_sorts). There is another competition with 1/2 as many combatants, and 1/2 of those (1/4 of the total) move on to the next round, etc. This also turns out to be O(n) work, since the (n/2) comparisons are performed during the first round, and subsequent rounds grow smaller much too quickly to be significant. (The sum 1/2 + 1/4 + 1/8 ... converges at 1, meaning a total of n comparisons are performed.)

All in all, O(n) work needs to be performed in order to finally produce the first element. Additional work needs to be performed for subsequent elements, but the total amount of work ends up being O(n log(n)).

Now contrast this with insert_sort. It too produces the first element in O(n) time, but the total amount of work is O(n²) in the worst case: it "inserts" the list's elements into the sorted list one by one, and traverses this sorted list potentially to its very end each time.

answered Oct 22 '22 06:10

Dan Burton

Related questions
                            
                                Find out n numbers of missing elements from an array in java
                            
                                Algorithm to find which numbers from a list of size n sum to another number
                            
                                Convert byte array into any base
                            
                                Why is quicksort faster in average than others?
                            
                                What is the most efficient purely functional algorithm for generating all prefixes of a list?
                            
                                Difference between Time Complexity and Running time
                            
                                Avoiding overflow in integer multiplication followed by division
                            
                                Fastest algorithm to hop through an array
                            
                                Sort points by angle from given axis?
                            
                                Alpha-beta pruning for Minimax
                            
                                possibility of starvation of Dining Philosophers
                            
                                Algorithm for topological sorting if cycles exist
                            
                                Find the maximum element which is common in two arrays?
                            
                                How waterproof is a Swiss cheese? [closed]
                            
                                Convert floating-point numbers to decimal digits in GLSL
                            
                                Fastest way to find minimal product of 2 array elements containing 200000+ elements
                            
                                Best algorithm for evaluating a mathematical expression?
                            
                                Binary String to Decimal String
                            
                                Date to Day of the week algorithm?
                            
                                CodeJam 2011: Solution for Gorosort?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With