Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why changing the Data.Binary.Put monad into a transformer creates a memory leak?

I'm trying to modify the Data.Binary.PutM monad into a monad transformer. So I started by changin it's definition from

newtype PutM a = Put { unPut :: PairS a }

to

newtype PutM a = Put { unPut :: Identity (PairS a) }

Then of course I changed the implementations of return and >>= functions:

From:

return a = Put $ PairS a mempty
{-# INLINE return #-}

m >>= k  = Put $
    let PairS a w  = unPut m
        PairS b w1 = unPut (k a)
    in PairS b (w `mappend` w1)
{-# INLINE (>>=) #-}

m >> k  = Put $
    let PairS _ w  = unPut m
        PairS b w1 = unPut k
    in PairS b (w `mappend` w1)
{-# INLINE (>>) #-}

To:

return a = Put $! return $! PairS a mempty
{-# INLINE return #-}

m >>= k  = Put $!
    do PairS a w  <- unPut m
       PairS b w1 <- unPut (k a)
       return $! PairS b $! (w `mappend` w1)
{-# INLINE (>>=) #-}

m >> k  = Put $!
    do PairS _ w  <- unPut m
       PairS b w1 <- unPut k
       return $! PairS b $! (w `mappend` w1)
{-# INLINE (>>) #-}

As if the PutM monad was just a Writer monad. Unfortunately this (again) created a space leak. It is clear to me (or is it?) that ghc is postponing evaluation somewhere but I tried to put $! instead of $ everywhere as suggested by some tutorials but that did not help. Also, I'm not sure how the memory profiler is helpful if what it shows me is this:

Memory profile.

And for completeness, this is the memory profile I get when using the original Data.Binary.Put monad:

Original memory profile

If interested, here is the code I'm using to test it and the line I'm using to compile, run and create the memory profile is:

ghc -auto-all -fforce-recomp -O2 --make test5.hs && ./test5 +RTS -hT && hp2ps -c test5.hp && okular test5.ps

I hope I'm not annoying anyone by my saga of memory leak questions. I find there isn't many good resources on internet about this topic which leaves a newbye clueless.

Thanks for looking.

like image 590
Peter Jankuliak Avatar asked Mar 09 '11 11:03

Peter Jankuliak


1 Answers

As stephen tetley pointed out in his comment, the problem here is in excessive strictness. If you just add some more laziness to your Identity sample (~(PairS b w') in your (>>) definition) you'll get the same constant memory run picture:

data PairS a = PairS a {-# UNPACK #-}!Builder

sndS :: PairS a -> Builder
sndS (PairS _ !b) = b

newtype PutM a = Put { unPut :: Identity (PairS a) }

type Put = PutM ()

instance Monad PutM where
    return a = Put $! return $! PairS a mempty
    {-# INLINE return #-}

    m >>= k  = Put $!
        do PairS a w  <- unPut m
           PairS b w' <- unPut (k a)
           return $! PairS b $! (w `mappend` w')
    {-# INLINE (>>=) #-}

    m >> k  = Put $!
        do PairS _ w  <- unPut m
           ~(PairS b w') <- unPut k
           return $! PairS b $! (w `mappend` w')
    {-# INLINE (>>) #-}

tell' :: Builder -> Put
tell' b = Put $! return $! PairS () b

runPut :: Put -> L.ByteString
runPut = toLazyByteString . sndS . runIdentity . unPut

You actually can use normal tuples here and $ instead of $!

PS Once again: the right answer is actually in stephen tetley comment. The thing is that your 1st example uses lazy let bindings for >> implementation, so the Tree is not forced to be built entirely and hence "is streamed". Your 2nd Identity example is strict, so my understanding is that the whole Tree gets built in memory before being processed. You can actually easily add strictness to the 1st example and observe how it starts 'hogging' memory:

m >> k  = Put $
          case unPut m of
            PairS _ w ->
                case unPut k of
                  PairS b w' ->
                      PairS b (w `mappend` w')
like image 187
Ed'ka Avatar answered Sep 23 '22 19:09

Ed'ka