 

How does quicksort in Haskell work?

On the Haskell website, there's this example quicksort implementation:

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser  = filter (< p) xs
        greater = filter (>= p) xs

There is an explanation on the site, but I have a couple of questions that I didn't see addressed:

  • Where is the actual comparison/swap done on two elements for a re-order? Is this handled by the 'Ord' (ordered) type definition itself, so that the type enforces this condition of being ordered?
  • The 'greater' filter defines items '>= p' (the pivot), so doesn't this mean we'll end up with an extra pivot [p] in the resulting list of the function, due to the '++ [p]' item?
asked Apr 16 '12 01:04 by dodgy_coder



2 Answers

  1. There is no swap, because this is not the (almost) in-place version of quicksort. Instead, new lists are built and then concatenated. The comparisons happen when lesser and greater are created, via < and >=. Ord is a typeclass constraint restricting a to orderable types; without it, you wouldn't be able to use < or >=.
  2. No, because the pivot is not part of xs: the pattern match (p:xs) splits the input list into its head p and its tail xs.
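
To make both points concrete, here is a minimal sketch of a single partitioning step. `partitionStep` is a hypothetical helper introduced only for illustration (it is not part of the example from the question); it returns the two filtered lists together with the pivot, so you can see that the comparisons happen inside `filter` and that the pivot itself never appears in `xs`.

```haskell
-- Hypothetical helper (not from the original example): perform one
-- partitioning step of the quicksort above and expose its three pieces.
partitionStep :: Ord a => [a] -> Maybe ([a], a, [a])
partitionStep []     = Nothing               -- nothing to split
partitionStep (p:xs) = Just (lesser, p, greater)
  where
    lesser  = filter (< p) xs    -- the comparisons happen here ...
    greater = filter (>= p) xs   -- ... and here; p itself is not in xs
```

In GHCi, `partitionStep [5, 5, 6, 3, 1]` evaluates to `Just ([3,1],5,[5,6])`: the second 5 lands in `greater`, while the pivot 5 appears exactly once.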

Here's a crappy ASCII visualisation:

                                qs [5, 5, 6, 3, 1]
                                          |
                         qs [3, 1]   ++  [5] ++ qs [5, 6]
                             |            |       |
                  qs [1] ++ [3] ++ qs []  |    qs [] ++ [5] ++ qs [6]
                             |            |       |
                           [1, 3]    ++  [5]  ++ [5, 6]
                             \            |        /
                              \-------------------/
                                        |
                                  [1, 3, 5, 5, 6]
answered Oct 21 '22 20:10 by Cat Plus Plus



where is the actual comparison/swap done on two elements for a re-order? Is this handled by the Ord (ordered) type definition itself. So the type enforces this condition of being ordered?

What does Ord mean?

Ord just means that values of type a must be comparable with one another; more precisely, operations such as <, <=, >, and >= must be defined for a (== comes from Eq, which every Ord type must also implement). You can think of it as a constraint on the function's type.
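
As a small sketch of what the constraint buys you, the block below runs the same quicksort on a user-defined type. `Priority` and `sortedPriorities` are hypothetical names made up for this illustration; the only assumption is that the type has an `Ord` instance, here obtained with `deriving`.

```haskell
-- The quicksort from the question, repeated so this block stands alone.
quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = quicksort lesser ++ [p] ++ quicksort greater
  where
    lesser  = filter (< p) xs
    greater = filter (>= p) xs

-- Hypothetical type for illustration. Deriving Ord orders the
-- constructors as written: Low < Medium < High.
data Priority = Low | Medium | High
  deriving (Eq, Ord, Show)

-- Works because Priority has an Ord instance.
sortedPriorities :: [Priority]
sortedPriorities = quicksort [High, Low, Medium, Low]
-- [Low,Low,Medium,High]
```

For a type with no `Ord` instance (functions, for example), a call such as `quicksort [not, id]` is rejected by the type checker at compile time rather than failing at run time.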

So, where is the ordering done?

The answer is in the last pattern:

quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser  = filter (< p) xs
        greater = filter (>= p) xs

At run time, the function receives a list, and that list must match one of these two patterns:

Pattern 1: It is empty, in which case the function returns an empty list and stops.

Pattern 2: It is not empty; in other words, there is a head element p prepended to a tail list xs. In that case, the function puts p in the middle, puts all elements of xs that are less than p on its left (lesser), and puts all elements of xs that are greater than or equal to p on its right (greater). The function then applies itself (the same function quicksort) to lesser and to greater. Each recursive call works on a strictly shorter list, so this continues until the lists are empty and pattern 1 ends the recursion.

Once those recursive calls return, the function returns the list:

sortedlesser ++ [p] ++ sortedgreater

where sortedlesser is the list that results from applying quicksort to lesser, and sortedgreater is the list that results from applying quicksort to greater. Note that p is wrapped as the one-element list [p], because ++ joins lists.
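
The walkthrough above can be sketched with every intermediate value named explicitly. `quicksortExplicit` is a hypothetical restatement written for this explanation; it is meant to behave exactly like the quicksort from the question, just with the two patterns and the two sorted halves spelled out.

```haskell
-- Hypothetical restatement of the question's quicksort; the behaviour is
-- intended to be identical, only the intermediate names are spelled out.
quicksortExplicit :: Ord a => [a] -> [a]
quicksortExplicit list = case list of
  []       -> []                               -- pattern 1: empty list
  (p : xs) ->                                  -- pattern 2: head p, tail xs
    let lesser        = filter (< p) xs        -- elements that go left of p
        greater       = filter (>= p) xs       -- elements that go right of p
        sortedLesser  = quicksortExplicit lesser
        sortedGreater = quicksortExplicit greater
    in  sortedLesser ++ [p] ++ sortedGreater   -- [p] is a one-element list
```

For instance, `quicksortExplicit [5, 5, 6, 3, 1]` gives `[1,3,5,5,6]`, matching the ASCII visualisation in the other answer.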

Wait… are we not duplicating p again and again?

the 'greater' predicate defines items '>= p' (the pivot), so doesn't this mean we'll end up with an extra pivot [p] in resulting list of the function, due to the '++ [p]' item?

No, that is not how the pattern match works. greater contains all elements of xs that are greater than or equal to p, and by construction p itself is not in xs. If xs contains duplicates of p, they end up on the right-hand side of p. Note that this choice keeps equal elements in their original relative order, so the sort is stable.
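
The remark about preserving the original ordering can be checked with a small sketch. `quicksortOn` and `example` are hypothetical names introduced here, not part of the original code: the algorithm is the same, but elements are compared by a key, so you can see what happens to elements whose keys are equal.

```haskell
-- Hypothetical variant for illustration: the same algorithm, but comparing
-- elements by a key extracted with f.
quicksortOn :: Ord k => (a -> k) -> [a] -> [a]
quicksortOn _ []     = []
quicksortOn f (p:xs) = quicksortOn f lesser ++ [p] ++ quicksortOn f greater
  where
    lesser  = filter (\x -> f x <  f p) xs
    greater = filter (\x -> f x >= f p) xs   -- equal keys stay after p

-- Pairs sorted by their first component only.
example :: [(Int, Char)]
example = quicksortOn fst [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]
-- [(1,'b'),(1,'d'),(2,'a'),(2,'c')]
```

Within each key, the letters come out in the order they went in ('b' before 'd', 'a' before 'c'), because `filter` never reorders elements and duplicates of the pivot's key are always placed after the pivot.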

answered Oct 21 '22 18:10 by Apoorv
