Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Puzzling memory behavior in Haskell

Tags:

arrays

haskell

I am trying to compare the performance of Haskell lists and arrays and run into a strange behavior. I observed that if I create a array and then print it it takes 'x' MB of memory but if I convert the array into a list using the 'elems' function and then print it takes memory less than 'x'. Isn't the 'elems' function supposed to create a new list from the array? Shouldn't it take more space than the function which does not create the list? I am using the -O2 and -fspec-constr optimizing flags. I am also using the -p flag to profile the functions.

This is the code without intermediate list,

fun  n = runST $  do
      final_soln <- newArray_ ((1,1), (n, (10 ^ n))) :: ST s ((STUArray s)  (Int,Int) Int)
      U.unsafeFreeze final_soln

main = do 
    [n] <- getArgs
    {-#  SCC "file-writing" #-} (writeFile "cprod-starray-creation.txt"  $ show $ fun (read n))

This is the code with the intermediate list,

fun :: Int -> UArray (Int,Int) Int
fun  n = runST $  do
      final_soln <- newArray_ ((1,1), (n, (10 ^ n))) :: ST s ((STUArray s)  (Int,Int) Int)
      U.unsafeFreeze final_soln

main = do 
    [n] <- getArgs
    {-#  SCC "file-writing" #-} (writeFile "cprod-starray-creation.txt"  $ show $ elems $ fun (read n))

Thanks in advance

like image 874
prasannak Avatar asked Oct 08 '12 11:10

prasannak


1 Answers

There is a lack of lazyness in the first variant, and it is not your fault. Comparing the profiling output (+RTS -hd) of a run with parameter 6 gives the first hint:

Profiling of the first code

is the profiling output of the first code, while

Profiling of the first code

is the result of the second code. The array itself (ARR_WORDS) takes the same space in both. You also see that the first code produces a large list (recognizable by the : constructor) of Int constructors (which happen to have the name Int#).

Now this cannot be the final String as printed, as that would be Chars then (with constructor C#). It can also not be the list of elements in the array, as the array contains zeros and the garbage collector has a cache of small Ints (in the range [-16,16]) that it would use instead of allocating (or actually instead of copying) a new constructor.

Also note that it takes about 24MB for the : constructors and 16MB for the I# constructors. Knowing that the former require 3 words and the latter 2 words, and that one word on my machine is 8 bytes long, we find that the list is 100000=10^6 elements long. So a very good candidate is the range of the second index parameter.

So where is this array? Let us trace your call to show:

showsIArray :: (IArray a e, Ix i, Show i, Show e) => Int -> a i e -> ShowS
showsIArray p a =
    showParen (p > 9) $
    showString "array " .
    shows (bounds a) .
    showChar ' ' .
    shows (assocs a)

(Code from Data.Array.Base). Clearly, the culprit must be in the assocs call, so here is the source for that:

assocs :: (IArray a e, Ix i) => a i e -> [(i, e)]
assocs arr = case bounds arr of
    (l,u) -> [(i, arr ! i) | i <- range (l,u)]

Since we are looking for the a list of indices, the range call looks suspicious enough. For that we have to look into the source of GHC.Arr (which unfortunately haddock messed up):

instance (Ix a, Ix b) => Ix (a, b) where
    range ((l1,l2),(u1,u2)) = [ (i1,i2) | i1 <- range (l1,u1), i2 <- range (l2,u2) ]

And now we have found the culprit: Here, range (l2,u2) will evaluate to the list [1..1000000] and shared for every step in the first component of the index.

Now I guess you’ll be curious in trying to put assocs instead of elems in the second code, and expecting the space leak there. But you won’t see it. The reason is that show is not inlined, but assocs itself gets inlined, and then also a whole bunch of other functions, including range effectively avoiding the sharing. You can see that (somewhat) by passing -ddump-rule-firings to GHC:

$ ghc --make -O2 -fspec-constr -ddump-rule-firings -fforce-recomp code2.hs -prof -fprof-auto
[1 of 1] Compiling Main             ( code2.hs, code2.o )
Rule fired: SPEC GHC.Arr.$fIx(,)
Rule fired: unpack
Rule fired: Class op fail
Rule fired: unpack
Rule fired: Class op show
Rule fired: Class op newArray_
Rule fired: unsafeFreeze/STUArray
Rule fired: Class op >>=
Rule fired: Class op >>=
Rule fired: Class op showList
Rule fired: Class op rangeSize
Rule fired: Class op rangeSize
Rule fired: Class op bounds
Rule fired: Class op bounds
Rule fired: Class op numElements
Rule fired: Class op index
Rule fired: Class op unsafeAt
Rule fired: Class op range
Rule fired: fold/build
Rule fired: SPEC GHC.Real.^
Rule fired: unpack-list
Rule fired: foldr/app
Rule fired: unpack-append
Rule fired: foldr/app
Rule fired: unpack-append
Rule fired: >#
Rule fired: >#
Rule fired: x# <=# x#
Rule fired: x# -# x#
Rule fired: 0# *# x#
Rule fired: 0# +# x#
Linking code2 ...
like image 87
Joachim Breitner Avatar answered Nov 02 '22 21:11

Joachim Breitner