
Optimizing radix sort in Haskell

I'm still learning Haskell and wrote the following radix sort function. It seems to work correctly, but it is rather memory-inefficient: compiled with GHC, memory usage climbs well over 500 MB even for an input list of only 10000 elements.

How could the following algorithm/code be improved to make it more efficient in terms of speed and memory? What is the best place to start?

import System.Random

-- radixsort for positive integers. uses 10 buckets
radixsort :: [Int] -> [Int]
radixsort [] = []
radixsort xs =
    -- given the data, get the number of passes that are required for sorting
    -- the largest integer
    let maxPos = floor ((log (fromIntegral (foldl max 0 xs)) / log 10) + 1)

        -- start sorting from digit on position 0 (lowest position) to position 'maxPos'
        radixsort' ys pos
         | pos < 0   = ys
         | otherwise = let sortedYs   = radixsort' ys (pos - 1)
                           newBuckets = radixsort'' sortedYs [[] | i <- [1..10]] pos
                       in  [element | bucket <- newBuckets, element <- bucket]

        -- given empty buckets, digit position and list, sort the values into
        -- buckets
        radixsort'' []     buckets _   = buckets
        radixsort'' (y:ys) buckets pos =
            let digit = div (mod y (10 ^ (pos + 1))) (10 ^ pos)
                (bucketsBegin, bucketsEnd) = splitAt digit buckets
                bucket = head bucketsEnd
                newBucket = bucket ++ [y]
            in radixsort'' ys (bucketsBegin ++ [newBucket] ++ (tail bucketsEnd)) pos
    in radixsort' xs maxPos

-- get an infinite list of random Ints for a given seed
getRandIntArray :: Int -> [Int] 
getRandIntArray seed = (randomRs (0, div (maxBound :: Int) 2) (mkStdGen seed))

main = do
        value <- (\x -> return x ) (length (radixsort (take 10000 (getRandIntArray 0))))
        print value
Timo asked Mar 11 '11 19:03



1 Answer

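The main problem is your function radixsort'': ++ is O(n) in the length of its first argument, which it copies on every call. Because bucket ++ [y] runs once per element, filling a bucket takes time quadratic in its size, and bucketsBegin ++ ... rebuilds part of the bucket list on every step as well. One way to avoid this is to tag each element with its digit and then collect each bucket with filter. Note that this radixsort'' no longer takes a buckets argument, so the call in radixsort' becomes radixsort'' sortedYs pos: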

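-- Collect buckets n, n-1, ..., 0, consing each onto the accumulator, so the
-- result is ordered from bucket 0 upwards.
pack (-1) r' _ = r'
pack n  r' relems =
    let getn = map snd . filter ((n ==) . fst)
    in pack (n - 1) (getn relems : r') relems

-- Tag each element with its digit at position 'pos', then build the 10 buckets.
radixsort'' elems pos =
    let digit y = div (mod y (10 ^ (pos + 1))) (10 ^ pos)
        relems = zip (map digit elems) elems
    in pack 9 [] relems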
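
For reference, here is a minimal, self-contained sketch of how the same filter-based bucketing could be assembled into a complete sort. It is not the code from the question: the names numDigits, bucketsBy and radixSort are made up for this example, the number of passes is counted with integer division instead of floating-point log, and the digit is extracted as (y `div` 10 ^ pos) `mod` 10 so that 10 ^ (pos + 1) is never computed (assuming a 64-bit Int, that power would overflow at pos = 18).

```haskell
import Data.List (foldl')

-- Number of decimal digits of a non-negative Int, by repeated division.
-- (Hypothetical helper for this sketch; the question uses 'log' instead.)
numDigits :: Int -> Int
numDigits n
  | n < 10    = 1
  | otherwise = 1 + numDigits (n `div` 10)

-- Split the list into 10 buckets by the digit at position 'pos'.
-- Each bucket keeps the elements in their original relative order,
-- so every pass is stable.
bucketsBy :: Int -> [Int] -> [[Int]]
bucketsBy pos xs = [ [ y | (d, y) <- tagged, d == b ] | b <- [0 .. 9] ]
  where
    digit y = (y `div` (10 ^ pos)) `mod` 10
    tagged  = zip (map digit xs) xs

-- LSD radix sort for non-negative Ints: one stable bucket pass per digit,
-- starting at the lowest position.
radixSort :: [Int] -> [Int]
radixSort [] = []
radixSort xs = foldl' pass xs [0 .. numDigits (maximum xs) - 1]
  where
    pass ys pos = concat (bucketsBy pos ys)
```

Each pass still walks the tagged list ten times (once per bucket), but nothing is ever appended to the end of a list, so the repeated copying caused by ++ in the original radixsort'' goes away.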
Kru answered Sep 23 '22 23:09
