Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do nested MaybeT's cause exponential allocation

I have a program.

import Control.Monad
import Control.Monad.Identity
import Control.Monad.Trans.Maybe

import System.Environment

tryR :: Monad m => ([a] -> MaybeT m [a]) -> ([a] -> m [a])
tryR f x = do
  m <- runMaybeT (f x)
  case m of
    Just t -> return t
    Nothing -> return x

check :: MonadPlus m => Int -> m Int
check x = if x `mod` 2 == 0 then return (x `div` 2) else mzero

foo :: MonadPlus m => [Int] -> m [Int]
foo [] = return []
foo (x:xs) = liftM2 (:) (check x) (tryR foo xs)


runFoo :: [Int] -> [Int]
runFoo x = runIdentity $ tryR foo x

main :: IO ()
main = do
  [n_str] <- getArgs
  let n = read n_str :: Int
  print $ runFoo [2,4..n]

The main interesting thing about this program is that it can have many nested layers of MaybeT's. Here, doing so serves absolutely no purpose, but it did in the original program where I encountered this problem.

Care to take a guess of the time complexity of this program?

Okay, you cheated by reading the title of this question. Yes, it's exponential:

[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 50                                                                                                                                        (03-31 17:15)
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]
./ExpAlloc 50  8.10s user 0.06s system 99% cpu 8.169 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 52                                                                                                                                        (03-31 17:15)
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]
./ExpAlloc 52  16.10s user 0.12s system 99% cpu 16.227 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 54                                                                                                                                        (03-31 17:16)
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27]
./ExpAlloc 54  32.32s user 0.23s system 99% cpu 32.561 total

Some further inspection shows the reason is because it allocates an exponential amount of memory, which naturally takes an exponential amount of time:

[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 40 +RTS -s                                                                                                                                (03-31 17:17)
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
     939,634,520 bytes allocated in the heap
       5,382,816 bytes copied during GC
          75,808 bytes maximum residency (2 sample(s))
          66,592 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1796 colls,     0 par    0.008s   0.009s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.243s  (  0.246s elapsed)
  GC      time    0.008s  (  0.009s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.252s  (  0.256s elapsed)

  %GC     time       3.2%  (3.6% elapsed)

  Alloc rate    3,869,930,149 bytes per MUT second

  Productivity  96.8% of total user, 95.3% of total elapsed

./ExpAlloc 40 +RTS -s  0.25s user 0.00s system 98% cpu 0.260 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 42 +RTS -s                                                                                                                                (03-31 17:17)
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
   1,879,159,424 bytes allocated in the heap
      10,767,048 bytes copied during GC
          95,504 bytes maximum residency (3 sample(s))
          71,152 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3593 colls,     0 par    0.016s   0.018s     0.0000s    0.0000s
  Gen  1         3 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.493s  (  0.498s elapsed)
  GC      time    0.016s  (  0.018s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.510s  (  0.517s elapsed)

  %GC     time       3.1%  (3.5% elapsed)

  Alloc rate    3,810,430,292 bytes per MUT second

  Productivity  96.8% of total user, 95.7% of total elapsed

./ExpAlloc 42 +RTS -s  0.51s user 0.01s system 99% cpu 0.521 total
[jkoppel@dhcp-18-189-103-38:~/tmp]$ time ./ExpAlloc 44 +RTS -s                                                                                                                                (03-31 17:17)
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22]
   3,758,208,408 bytes allocated in the heap
      21,499,312 bytes copied during GC
         102,056 bytes maximum residency (5 sample(s))
          73,784 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      7186 colls,     0 par    0.032s   0.037s     0.0000s    0.0009s
  Gen  1         5 colls,     0 par    0.000s   0.001s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.979s  (  0.987s elapsed)
  GC      time    0.033s  (  0.038s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.013s  (  1.024s elapsed)

  %GC     time       3.2%  (3.7% elapsed)

  Alloc rate    3,840,757,815 bytes per MUT second

  Productivity  96.7% of total user, 95.6% of total elapsed

./ExpAlloc 44 +RTS -s  1.01s user 0.01s system 99% cpu 1.029 total

I cannot for the life of me figure out why it does this. I'd appreciate any light people could shed on the situation.

like image 717
James Koppel Avatar asked Mar 31 '17 21:03

James Koppel


1 Answers

The transformers package (currently at version 0.5.4.0) implements MonadTrans using liftM:

lift :: Monad m => m a -> MaybeT m a
lift = MaybeT . liftM Just

where liftM is a combinator defined as

liftM :: Monad m => (a -> b) -> m a -> m b
liftM f m = m >>= \a -> return (f a)

Furthermore, return is defined for MaybeT as

return :: Monad m => a -> MaybeT m a
return a = lift . return

Reduce the definition:

return :: Monad m => a -> MaybeT m a
return a = MaybeT (return a >>= \a -> return (Just a))

where the two inner return are instantiated at type m.

One call to return @(MaybeT m) calls return @m twice, hence the exponential behavior you observe for a tower of MaybeT.

This is fixable by using fmap instead of liftM, but this is backwards incompatible, when Functor was not a superclass of Monad.

EDIT: Other transformers do not have this issue, as return is not defined using lift, which provides an even better fix.

return = MaybeT . return . Just

Here is a more minimal test case:

{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
import Control.Monad.Trans.Maybe
import System.Environment

f :: forall m proxy. Monad m => proxy m -> Int -> ()
f _ 0 = (return () :: m ()) `seq` ()
f _ n = f (undefined :: proxy (MaybeT m)) (n - 1)

main = do
  n : _ <- getArgs
  f (undefined :: proxy []) (read n) `seq` return ()

Output

> for i in {18..21} ; time ./b $i
./b $i  0.35s user 0.04s system 99% cpu 0.390 total
./b $i  0.71s user 0.07s system 99% cpu 0.782 total
./b $i  1.38s user 0.18s system 99% cpu 1.565 total
./b $i  2.82s user 0.32s system 100% cpu 3.139 total
like image 134
Li-yao Xia Avatar answered Nov 17 '22 23:11

Li-yao Xia