Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Stack overflow in monoidal fold over large list

First some imports,

import Control.Applicative
import Data.Traversable as T
import Data.Foldable as F
import Data.Monoid

Say I have a functor holding a pair of values,

data Fret a = Fret a a deriving (Show)

instance Functor Fret where fmap f (Fret a b) = Fret (f a) (f b)

instance Applicative Fret where
    pure a = Fret a a
    Fret aa ab <*> Fret ba bb = Fret (aa ba) (ab bb)

instance Monoid a => Monoid (Fret a) where
    mempty = Fret mempty mempty
    a `mappend` b = mappend <$> a <*> b

I have a large list of these,

frets = replicate 10000000 (Fret 1 2)

over which I want to compute a, e.g., an average,

data Average a = Average !Int !a deriving (Read, Show)

instance Num a => Monoid (Average a) where
    mempty = Average 0 0
    Average n a `mappend` Average m b = Average (n+m) (a+b)

runAverage :: Fractional a => Average a -> a
runAverage (Average n a) = a / fromIntegral n

average = Average 1

Here are a few potential implementations of this,

average1 = runAverage <$> foldMap (fmap average) frets

average2 = pure (runAverage . mconcat) <*> T.sequenceA (map (pure (Average 1) <*>) frets)

Unfortunately, all of these result in a stack overflow.

Thinking that the problem might be excessive laziness in Foldable.foldMap, I tried implementing a stricter variant,

foldMap' :: (F.Foldable f, Monoid m) => (a -> m) -> f a -> m
foldMap' f = F.foldl' (\m a->mappend m $! f a) mempty

average3 = runAverage <$> foldMap' (fmap average) frets

Unfortunately, this too overflows.

How can one accomplish this without compromise the clean structure of the approach?

Update

If I make the fields of Fret strict, things appear to work as expected. Checking to see if this works in the larger application.

like image 481
bgamari Avatar asked Feb 23 '13 22:02

bgamari


1 Answers

Looks like foldMap is too lazy, and your Fret data type certainly is, leading to a classic foldl (+) type space leak, where you accumulate a large chain of thunks trying to reduce your innput list to its average. It is analogous to the space leaks in list average with tuples.

Clearly the accumulator in your only loop is too lazy -- the only place you use the stack is in foldMap

enter image description here

Using the same solution - a strict pair type for Frets and foldl' implementation of foldMap is enough, and it will run in constant space:

 foldMap' f = F.foldl' (\m -> mappend m . f) mempty

and

 data Fret a = Fret !a !a

enter image description here

like image 131
Don Stewart Avatar answered Oct 27 '22 18:10

Don Stewart