Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unexpected memory growth with Control.Monad foldM

Tags:

haskell

I have the following code, which has been stripped down and is I think as minimal as possible that has some very odd behaviour.

The code consists of two source files: One to define some data:

module MyFunction where

data MyFunction =
    MyFunction {
        functionNumber :: Int,
        functionResult :: IO String
        }

makeMyFunction :: Show a => Int -> IO a -> MyFunction
makeMyFunction number result = MyFunction {
    functionNumber = number,
    functionResult = result >>= return . show }

And the other is Main:

module Main (main) where

import System.CPUTime (getCPUTime)
import Data.List (foldl')
import Data.Foldable (foldlM)
import Control.Monad (foldM)
import MyFunction

exampleFunction = do
    --let x = foldl' (\a b -> a `seq` (a + b)) 0 [1..20000000]      -- This works
    --x <- foldlM (\a b -> a `seq` return (a + b)) 0 [1..20000000]  -- This works (*)
    x <- foldM (\a b -> a `seq` return (a + b)) 0 [1..20000000]    -- This doesn't
    print x
    return ()

runFunction fn = do
    result <- functionResult fn
    duration <- getCPUTime
    if result /= "()"
        then putStrLn ""
        else return ()
    putStrLn (show (fromIntegral duration / (10^9)) ++ "ms")
    return fn

main = do
    runFunction (makeMyFunction 123 exampleFunction)
    return ()

The code as above (compiled using GHC 7.10.3 with stack 1.0.0 with default flags) has a rapid increase in memory usage (exceeding 1GB), and takes typically 3.3 seconds.

If I make a changes to the code, for example:

  • Use one of the commented alternatives to the problem line
  • Take out any line from runFunction

The memory usage will remain minimal, and takes only about 1 second.

One feature that I think is most surprising to me is that replacing foldM with foldlM (which as far as I know foldM = foldlM) fixes the problem.

Also making changes to code that I don't see has any relationship to the problem lines of code also fixes the problem. For example removing the last putStrLn.

Another oddity is that if I merge the MyFunction module into the Main module, while it doesn't fix the problem, it actually causes foldlM to behave as foldM using excessive memory.

In the real code that this came from, I have a large number exampleFunctions, and there is significantly more Main code, and every so often I encounter this sort of unexplained memory usage from functions, that can usually be resolved by some sort of voodoo.

I'm looking for an explanation for the behaviour. If I know why this occurs I can then look into avoiding it. Could this be a compiler issue, or maybe just a misunderstanding on my part?

(*) I've highlighted the secondary issue that causes the same memory growth to occur with foldlM.

like image 378
pticawr Avatar asked Jan 16 '16 19:01

pticawr


1 Answers

Here is foldlM from Foldable.hs (ghc)

-- | Monadic fold over the elements of a structure,
-- associating to the left, i.e. from left to right.
foldlM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
foldlM f z0 xs = foldr f' return xs z0
  where f' x k z = f z x >>= k

and foldM from Monad.hs

foldM          :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
{-# INLINEABLE foldM #-}
{-# SPECIALISE foldM :: (a -> b -> IO a) -> a -> [b] -> IO a #-}
{-# SPECIALISE foldM :: (a -> b -> Maybe a) -> a -> [b] -> Maybe a #-}
foldM          = foldlM

I placed these definitions to a separate module Test and tested the execution with and without INLINEABLE / SPESIALISE lines. Whatever the reason is, leaving out the SPECIALISE directives helped and the execution time and memory usage was like with foldlM.

After a little bit more digging, removing line

{-# SPECIALISE foldM :: (a -> b -> IO a) -> a -> [b] -> IO a #-}

effected the most.

like image 146
J.J. Hakala Avatar answered Nov 17 '22 15:11

J.J. Hakala