I understand that:
head (map (2**) [1..999999])
Will only actually evaluate 2**1, and none of the rest, but the book I am reading says that:
head (sort somelist)
Will only need to find the smallest item in the list, because that is all that is used. How does this work? As far as I can tell, this would be impossible with the sorting algorithms I know (like bubble sorting).
The only way I can think that this would work is if the sorting algorithm were to go through the entire list looking for the smallest item, and then recurse on the list without that item. To me, this sounds really slow.
Is this how the sort function works, or is there another sorting algorithm I don't know about, that would allow for short circuiting like it is?
This:
Will only need to find the smallest item in the list, because that is all that is used.
... should really say that the function only needs to do the minimal amount of work that the sorting algorithm requires to find the smallest element.
For example, if we are using quicksort as our underlying sorting algorithm, then head . quicksort
is equivalent to the optimal (!) selection algorithm known as 'quickselect', which is worst-case linear. Moreover, we can implement k-quickselect merely by take k . quicksort
.
Wikipedia notes in its article upon selection algorithms that (my emphasis):
Because language support for sorting is more ubiquitous, the simplistic approach of sorting followed by indexing is preferred in many environments despite its disadvantage in speed. Indeed, for lazy languages, this simplistic approach can even get you the best complexity possible for the k smallest/greatest sorted (with maximum/minimum as a special case) if your sort is lazy enough.
Quicksort works well in this scenario, whereas the default sort in Haskell (merge sort) doesn't compose quite as well, as it does more work than strictly necessary to return each element of the sorted list. As this post on the Haskell mailing list notes:
lazy quicksort is able to produce the batch of the first k smallest elements in
O(n + k log k) total time [1]
while lazy mergesort needs
O(n + k log n) total time [2]
For more you might like to read this blog post.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With