I implemented selection sort and compared it to Data.List's sort. It is orders of magnitudes faster than Data.List's sort. If I apply it to 10,000 randomly generated numbers the results are as follows:
✓ in 1.22µs: Selection sort
✓ in 9.84ms: Merge sort (Data.List)
This can't be right. First I thought maybe merge sort's intermediate results are cached and selection sort uses those to be much faster. Even when I comment out merge sort and only time selection sort, it is this fast however. I also verified the output and it is correctly sorted.
What causes this behaviour?
I use this code to test:
{-# LANGUAGE BangPatterns #-}
module Lib
( testSortingAlgorithms
) where
import System.Random (randomRIO)
import Text.Printf
import Control.Exception
import System.CPUTime
import Data.List (sort, sortOn)
selectionSort :: Ord a => [a] -> [a]
selectionSort [] = []
selectionSort nrs =
let (smallest, rest) = getSmallest nrs
in smallest : selectionSort rest
where getSmallest :: Ord a => [a] -> (a, [a])
getSmallest [a] = (a, [])
getSmallest (a:as) = let (smallest, rest) = getSmallest as
in if smallest > a then (a, smallest : rest)
else (smallest, a : rest)
main :: IO ()
main = testSortingAlgorithms
testSortingAlgorithms :: IO ()
testSortingAlgorithms = do
!list' <- list (10000)
results <- mapM (timeIt list') sorts
let results' = sortOn fst results
mapM_ (\(diff, msg) -> printf (msg) (diff::Double)) results'
return ()
sorts :: Ord a => [(String, [a] -> [a])]
sorts = [
("Selection sort", selectionSort)
, ("Merge sort (Data.List)", sort)
]
list :: Int -> IO [Int]
list n = sequence $ replicate n $ randomRIO (-127,127::Int)
timeIt :: (Ord a, Show a)
=> [a] -> (String, [a] -> [a]) -> IO (Double, [Char])
timeIt vals (name, sorter) = do
start <- getCPUTime
--v <- sorter vals `seq` return ()
let !v = sorter vals
--putStrLn $ show v
end <- getCPUTime
let (diff, ext) = unit $ (fromIntegral (end - start)) / (10^3)
let msg = if correct v
then (" ✓ in %0.2f" ++ ext ++ ": " ++ name ++ "\n")
else (" ✗ in %0.2f" ++ ext ++ ": " ++ name ++ "\n")
return (diff, msg)
correct :: (Ord a) => [a] -> Bool
correct [] = True
correct (a:[]) = True
correct (a1:a2:as) = a1 <= a2 && correct (a2:as)
unit :: Double -> (Double, String)
unit v | v < 10^3 = (v, "ns")
| v < 10^6 = (v / 10^3, "µs")
| v < 10^9 = (v / 10^6, "ms")
| otherwise = (v / 10^9, "s")
You write
let !v = sorter vals
which is "strict", but only to WHNF. So you are only timing how long it takes to find the smallest element of the list, not how long it takes to sort the whole thing. Selection sort starts by doing exactly that, so it is "optimal" for this incorrect benchmark, while mergesort does a bunch more work that's "wasted" if you only look at the first element.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With