Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strange space behavior of Haskell program

I thought that the Cont monad is just equivalent to CPS Transformation, so if I have a monadic sum, if I run in the Identity monad, it will fail due to stack overflow, and if I run it in the Cont Monad, it will be okay due to tail recursion.

So I've written a simple program to verify my idea. But to my surprise, the result is unreasonable due to my limited knowledge.

All programs are compiled using ghc --make Test.hs -o test && ./test

sum0 n = if n==0  then  0  else n + sum0 (n-1)
sum1 n = if  n==0  then return 0 else sum1 (n-1) >>= \ v ->  seq v (return (n+v))
sum2 n k = if n == 0 then k 0 else sum2 n (\v -> k (n + v))
sum3 n k = if n == 0 then k 0 else sum3 n (\ !v -> k (n + v))
sum4 n k = if n == 0 then k 0 else sum4 n (\ v -> seq v ( k (n + v)))
sum5 n = if  n==0  then return 0 else sum5 (n-1) >>= \ v ->   (return (n+v)) 
  • main = print (sum0 3000000)
    Stack overflow. This is reasonable.

  • main = print (flip runCont id (sum1 3000000))
    Uses 180M memory, which is reasonable, but I am not clear why seq needed here, since its continuation is not applied until n goes to 0.

  • main = print (flip runCont id (sum5 3000000))
    Stack overflow. Why?

  • main = print (flip runCont (const 0) (sum1 3000000))
    Uses 130M memory. This is reasonable.

  • main = print (flip runCont (const 0) (sum5 3000000))
    Uses 118M memory. This is reasonable.

  • main = print (sum2 3000000 (const 0))
    Uses a lot of memory (more than 1G). I thought sum2 is equivalent to sum5 (when sum5 is in Cont monad). Why?

  • main = print (sum3 3000000 (const 0))
    Uses a lot of memory. I thought sum3 is equivalent to sum1 (Cont monad). Why?

  • main = print (runIdentity (sum1 3000000))
    Stack overflow, exactly what I want.

  • main = print (sum3 3000000 id)
    Uses a lot of memory. Equivalent to sum1, why?

  • main = print (sum4 3000000 id)
    Uses a lot of memory. Equivalent to sum1, why?

  • main = print (sum [1 .. 3000000])
    Stack overflow. The source of sum = foldl (+) 0, so this is reasonable.

  • main = print (foldl' (+) 0 [1 .. 3000000])
    Uses 1.5M.

like image 570
bobzhang Avatar asked Aug 23 '11 16:08

bobzhang


1 Answers

First of all, it looks to me like sum2, sum3, and sum4 never actually decrement n. So they're using lots of memory because they're going into an infinite loop that does allocation.

After correcting that, I've run each of your tests again with the following results, where "allocation" refers to approximate peak memory use:

  • main = print (sum0 3000000) : Stack overflow, after allocating very little memory
  • main = print (flip runCont id (sum1 3000000)) : Success, allocating similar amounts to what you saw
  • main = print (flip runCont id (sum5 3000000)) : Stack overflow, after allocating similar amounts of memory as sum1.
  • main = print (flip runCont (const 0) (sum1 3000000)) : Success, similar allocation as the above
  • main = print (flip runCont (const 0) (sum5 3000000)) : Same
  • main = print (sum2 3000000 (const 0)) : Success, about 70% as much allocation as sum1
  • main = print (sum3 3000000 (const 0)) : Success, about 50% as much allocation as sum1
  • main = print (runIdentity (sum1 3000000)) : Stack overflow, with little allocation
  • main = print (sum3 3000000 id) : Success, about 50% as much allocation as sum1
  • main = print (sum4 3000000 id) : Success, about 50% as much allocation as sum1
  • main = print (sum [1 .. 3000000]) : Stack overflow, with about 80% as much allocation as sum1
  • main = print (foldl' (+) 0 [1 .. 3000000]) : Success, with almost no allocation

So that's mostly what you expected, with the exception of why seq makes such a difference between sum1 vs. sum5.

like image 188
C. A. McCann Avatar answered Oct 03 '22 04:10

C. A. McCann