Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this Haskell program allocating so much memory?

Mind the following Haskell program:

-- (^) allocs memory so we define it using the native (**)
pow :: Int -> Int -> Int
pow x y = floor $ fromIntegral x ** fromIntegral y

-- tail recursive, div and mod are native, I believe, so, no alloc
isPalindrome :: Int -> Bool
isPalindrome x = go (pow 10 (floor $ logBase 10 $ fromIntegral x)) x
    where go m x = x <= 0 || div x m == mod x 10 && go (div m 100) (div (x - m * mod x 10) 10)

-- go is tail recursive too... no obvious allocation here
wanderer :: Int -> Int
wanderer n = go 0 (pow 10 n - 1) (pow 10 n - 1) where
    go p x y | p > 0 && div p x >= pow 10 n  = p                
    go p x y | p > 0 && y < div p x || y < x = go p (x-1) (pow 10 n - 1)
    go p x y | isPalindrome (x*y)            = go (x*y) x (y-1) 
    go p x y                                 = go p x (y-1)     

main = print . wanderer $ 8

Profiling it, we get:

Fri May  8 03:36 2015 Time and Allocation Profiling Report  (Final)

       aff +RTS -p -RTS

    total time  =        7.34 secs   (7344 ticks @ 1000 us, 1 processor)
    total alloc = 6,487,919,472 bytes  (excludes profiling overheads)

COST CENTRE     MODULE    %time %alloc

isPalindrome    Main       41.9   18.5
isPalindrome.go Main       22.6    1.4
wanderer.go     Main       20.0   67.8
pow             Main       15.5   12.3


                                                                individual     inherited
COST CENTRE           MODULE                  no.     entries  %time %alloc   %time %alloc

MAIN                  MAIN                     47           0    0.0    0.0   100.0  100.0
 main                 Main                     95           0    0.0    0.0     0.0    0.0
 CAF:main1            Main                     92           0    0.0    0.0     0.0    0.0
  main                Main                     94           1    0.0    0.0     0.0    0.0
 CAF:main2            Main                     91           0    0.0    0.0   100.0  100.0
  main                Main                     96           0    0.0    0.0   100.0  100.0
   wanderer           Main                     98           1    0.0    0.0   100.0  100.0
    pow               Main                    101           1    0.0    0.0     0.0    0.0
    wanderer.go       Main                     99    49995002   20.0   67.8   100.0  100.0
     isPalindrome     Main                    102    49985002   41.9   18.5    80.0   32.2
      pow             Main                    104    49985002   15.5   12.3    15.5   12.3
      isPalindrome.go Main                    103    52207117   22.6    1.4    22.6    1.4
     pow              Main                    100           1    0.0    0.0     0.0    0.0
   pow                Main                     97           2    0.0    0.0     0.0    0.0
 CAF                  GHC.Conc.Signal          85           0    0.0    0.0     0.0    0.0
 CAF                  GHC.IO.Encoding          78           0    0.0    0.0     0.0    0.0
 CAF                  GHC.IO.Encoding.Iconv    76           0    0.0    0.0     0.0    0.0
 CAF                  GHC.IO.Handle.FD         69           0    0.0    0.0     0.0    0.0
 CAF                  GHC.Event.Thread         55           0    0.0    0.0     0.0    0.0

As far as I'm aware, it seems like all my functions are tail recursive and those prelude functions are asm operations. Yet this simple program allocs 7gb of memory. Where is all the allocation coming from?

like image 231
MaiaVictor Avatar asked May 08 '15 06:05

MaiaVictor


People also ask

Why does Haskell use so much memory?

Haskell computations produce a lot of memory garbage - much more than conventional imperative languages. It's because data are immutable so the only way to store every next operation's result is to create new values. In particular, every iteration of a recursive computation creates a new value.

Is Haskell garbage collected?

The Haskell runtime system employs a generational garbage collector (GC) with two generations 2. Generations are numbered starting with the youngest generation at zero.

Why does Haskell need garbage collection?

The answer is that a GC is required under the hood to reclaim the heap objects that the language MUST create. For example. A pure function needs to create heap objects because in some cases it has to return them. That means that they can't be allocated on the stack.

Does Haskell have a heap?

By default, Haskell heap objects are stored “boxed” — they are represented as a pointer to an object on the heap. Unboxed objects on the other hand are represented as the actual raw data. In some languages these would also be referred to as reference and value types.


1 Answers

The allocation is coming from the go in isPalindrome:

go m x = x <= 0 || div x m == mod x 10 && go (div m 100) (div (x - m * mod x 10) 10)

We have a || on the right hand side. The short-circuit semantics of || is realized through lazy evaluation. GHC sees that the m argument is unused if x <= 0 evaluates to True, so it doesn't unbox m, allowing it to remain uncomputed. Of course, in this case we're better off unboxing m too, so let's do that.

{-# LANGUAGE BangPatterns #-}
go !m x = ...

Now with ghc -O2 and +RTS -s:

      52,016 bytes allocated in the heap
       3,408 bytes copied during GC
      44,312 bytes maximum residency (1 sample(s))
      17,128 bytes maximum slop
           1 MB total memory in use (0 MB lost due to fragmentation)
like image 120
András Kovács Avatar answered Nov 15 '22 09:11

András Kovács