Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to fix space leak caused by laziness when your algorithm relies on laziness

I have some algorithm which generates a search tree:

data SearchTree a = Solution a | Contradiction | Search [ SearchTree a ]
  deriving (Show, Functor)

The algorithm generates this tree lazily. I also defined a simple evaluator, which is really just depth first search.

simpleEval :: MonadPlus m => SearchTree a -> m a
simpleEval (Solution a) = return a 
simpleEval Contradiction = mzero 
simpleEval (Search ps) = foldr mplus mzero $ map simpleEval ps 

I noticed that many of the solutions that my algorithm produces look something like the following search tree:

nest :: Int -> SearchTree a -> SearchTree a 
nest 0 = id 
nest n = nest (n-1) . Search . (:[]) 

tree0 = Search ts where 
  ts = cycle $ t0 : replicate 100 t1 ++ [t2]
  t0 = nest 100 $ Solution 'a' 
  t1 = nest 1000 $ Contradiction 
  t2 = nest 4 $ Solution 'b' 

Namely, they have a lot of very deep branches with no solutions, a few deep branches with a solution, and very few shallow branches with a solution. On this basis, I decided I wanted another evaluator, one which will 'give up' on branches that are too deep. Call it cutoffEval. cutoffEval 5 tree0 should find only bs because it there are infinitely many branches of depth less than 5 to consider, and they only contain bs. I implemented it like so:

cutoff :: (MonadPlus m) => Int -> SearchTree a -> (m a, [SearchTree a])
cutoff cu = go cu where 
  plus ~(m0, l0) ~(m1, l1) = (mplus m0 m1, l0 ++ l1)
  zero = (mzero, [])

  go 0 x             = (mzero, [x])
  go _ Contradiction = zero
  go _ (Solution a)  = (return a, [])
  go d (Search ps)   = foldr plus zero $ map (go $ d-1) ps

cutoffEval :: MonadPlus m => Int -> SearchTree a -> m a
cutoffEval cu = go where 
  go t = case cutoff cu t of (r,ts) -> foldr mplus mzero $ r : map go ts 

But this function produces a huge space leak, as compared to simpleEval:

putStrLn $ take 4000 $ simpleEval tree0  -- 2MB residency
putStrLn $ take 4000 $ cutoffEval 10 tree0  -- 600MB residency

Profiling reveals that almost all allocation occurs in cutoff.go; and the majority of allocation is due something mysterious called main:Tree.sat_s5jg and the (,) constructor. This seems to me due to the irrefutable patterns, the tuple constructors are being built up as thunks instead of being forced by plus; and normally the solution to a space leak is to make your function stricter, but here removing the irrefutable patterns causes cutoff to hang, so I cannot do it.

I tested this with GHC 7.6, 7.8 and 7.10. The problem was found in each one.

So my questions are: Can cutoffEval be written to run in constant space like simpleEval? And more generally, how do I fix a space leak, if I can't make my implementation stricter because the algorithm depends on it?

like image 563
user2407038 Avatar asked Oct 16 '15 23:10

user2407038


1 Answers

It seems to me that the reason for the memory leak is actually a bug in the implementation. Your cutoff function mixes together cutting off too deep branches with evaluating the upper part. And then in cutoffEval, you go deeper to the bottom, cut branches, and continue to explore them recursively. Which is essentially breadth-first search, going by cu levels in each pass. This means that the whole tree will eventually be constructed and retained in memory until the end! (Unlike the case of a depth-first search, where visited subtrees can be reclaimed by the GC.)

If you want to just cut off the branches that are too deep, getting the first part of the result of cutoff is what you want.

In any case I'd suggest to separate the evaluator and the cut-off part (see below). In such a case you can just use the original evaluator on the cut-off version of the tree.

One additional remark, from the MonadPlus constraint you use just the monoidal part - mzero and mplus. It'd be cleaner and more generic to use just Monoid. There are more monoids than monads (for example Sum to just count solutoins, or Last to find the last solution).

simpleEval :: (Monoid m) => (a -> m) -> SearchTree a -> m
simpleEval f = go
  where
    go (Solution a) = f a
    go Contradiction = mempty
    go (Search ps) = mconcat $ map go ps 

cutoff :: Int -> SearchTree a -> SearchTree a
cutoff cu = go cu
  where 
    go 0 _             = Contradiction -- too deep branches are just failures
    go d (Search ps)   = Search $ map (go (d - 1)) ps
    go _ x             = x

cutoffEval :: (Monoid m) => Int -> (a -> m) -> SearchTree a -> m
cutoffEval cu f = simpleEval f . cutoff cu
like image 60
Petr Avatar answered Nov 07 '22 00:11

Petr