Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Monadic Fold in Constant Space

How can I fold a lazy list using a monadic action in constant space? The problem I'm trying to solve is aggregating a large file, and I believe for the sake of performance I require mutability. I have an implementation working in ST using mutable vectors, but it uses too much memory. Below is an example of what I'm attempting. I also experimented briefly with Conduit but that didn't appear to provide any improvement.

ST forM_:

import Control.Monad (forM_)
import Control.Monad.ST.Trans as STT
import Control.Monad.Identity as Identity

testST :: Int
testST = do
  Identity.runIdentity $ STT.runST $ do
    a <- STT.newSTRef 0
    forM_ [1..10000000] (\x -> do
        a' <- STT.readSTRef a
        STT.writeSTRef a (a' + x)
      )
    STT.readSTRef a

Conduit:

import Data.Conduit (($=),(=$),($$))
import qualified Data.Conduit as C
import qualified Data.Conduit.List as CL

testCL :: IO Int
testCL = CL.sourceList [1..10000000] $$ CL.foldM (\a x -> return (a + x)) 0
like image 647
ryachza Avatar asked Sep 21 '15 19:09

ryachza


1 Answers

The problem is not with the fold, but with the fold body. This program allocates a lot:

testST = runST $ do
    ref <- newSTRef 0
    forM_ [1..10000000] $ \x -> do
         val <- readSTRef ref
         writeSTRef ref (val + x)
    readSTRef ref

This program, whose only difference is on the writeSTRef line, allocates almost nothing:

testST = runST $ do
    ref <- newSTRef 0
    forM_ [1..10000000] $ \x -> do
        val <- readSTRef ref
        writeSTRef ref $! val + x
    readSTRef ref

The difference between the two pieces of code is a good hint to what's going on: in the former, you are creating a reference to a deeply-nested thunk with 10000000 layers of applications of +; whereas the latter flattens the thunk at each step.

By the way, this common pitfall is explicitly called out in the documentation for modifySTRef.

like image 59
Daniel Wagner Avatar answered Oct 24 '22 19:10

Daniel Wagner