Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to define foldM using foldr/foldl (if it is possible)?

I wanted to make a generic function that folds over a wide range of inputs (see Making a single function work on lists, ByteStrings and Texts (and perhaps other similar representations)). As one answer suggested, the ListLike is just for that. Its FoldableLL class defines an abstraction for anything that is foldable. However, I need a monadic fold. So I need to define foldM in terms of foldl/foldr.

So far, my attempts failed. I tried to define

foldM'' :: (Monad m, LL.FoldableLL full a) => (b -> a -> m b) -> b -> full -> m b
foldM'' f z = LL.foldl (\acc x -> acc >>= (`f` x)) (return z)

but it runs out of memory on large inputs - it builds a large unevaluated tree of computations. For example, if I pass a large text file to

main :: IO ()
main = getContents >>= foldM'' idx 0 >> return ()
  where
    -- print the current index if 'a' is found
    idx !i 'a' = print i >> return (i + 1)
    idx !i _   =            return (i + 1)

it eats up all memory and fails.

I have a feeling that the problem is that the monadic computations are composed in a wrong order - like ((... >>= ...) >>= ...) instead of (... >>= (... >>= ...)) but so far I didn't find out how to fix it.


Workaround: Since ListLike exposes mapM_, I constructed foldM on ListLikes by wrapping the accumulator into the state monad:

modifyT :: (Monad m) => (s -> m s) -> StateT s m ()
modifyT f = get >>= \x -> lift (f x) >>= put

foldLLM :: (LL.ListLike full a, Monad m) => (b -> a -> m b) -> b -> full -> m b
foldLLM f z c = execStateT (LL.mapM_ (\x -> modifyT (\b -> f b x)) c) z

While this works fine on large data sets, it's not very nice. And it doesn't answer the original question, if it's possible to define it on data that are just FoldableLL (without mapM_).

like image 244
Petr Avatar asked Feb 19 '23 20:02

Petr


1 Answers

So the goal is to reimplement foldM using either foldr or foldl. Which one should it be? We want the input to be processed lazily and allow for infinte lists, this rules out foldl. So foldr is it going to be.

So here is the definition of foldM from the standard library.

foldM             :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
foldM _ a []      =  return a
foldM f a (x:xs)  =  f a x >>= \fax -> foldM f fax xs

The thing to remember about foldr is that its arguments simply replace [] and : in the list (ListLike abstracts over that, but it still serves as a guiding principle).

So what should [] be replaced with? Clearly with return a. But where does a come from? It won’t be the initial a that is passed to foldM – if the list is not empty, when foldr reaches the end of the list, the accumulator should have changed. So we replace [] by a function that takes an accumulator and returns it in the underlying monad: \a -> return a (or simply return). This also gives the type of the thing that foldr will calculate: a -> m a.

And what should we replace : with? It needs to be a function b -> (a -> m a) -> (a -> m a), taking the first element of the list, the processed tail (lazily, of course) and the current accumulator. We can figure it out by taking hints from the code above: It is going to be \x rest a -> f a x >>= rest. So our implementation of foldM will be (adjusting the type variables to match them in the code above):

foldM'' :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
foldM'' f z list = foldr (\x rest a -> f a x >>= rest) return list z

And indeed, now your program can consume arbitrary large input, spitting out the results as you go.

We can even prove, inductively, that the definitions are semantically equal (although we should probably do coinduction or take-induction to cater for infinite lists).

We want to show

foldM f a xs = foldM'' f a xs

for all xs :: [b]. For xs = [] we have

  foldM f a []
≡ return a     -- definition of foldM
≡ foldr (\x rest a -> f a x >>= rest) return [] a -- definition of foldr
≡ foldM'' f a [] -- definition of foldM''

and, assuming we have it for xs, we show it for x:xs:

  foldM f a (x:xs)
≡ f a x >>= \fax -> foldM f fax xs --definition of foldM
≡ f a x >>= \fax -> foldM'' f fax xs -- induction hypothesis
≡ f a x >>= \fax -> foldr (\x rest a -> f a x >>= rest) return xs fax -- definition of foldM''
≡ f a x >>= foldr (\x rest a -> f a x >>= rest) return xs -- eta expansion
≡ foldr (\x rest a -> f a x >>= rest) return (x:xs) -- definition of foldr
≡ foldM'' f a (x:xs) -- definition of foldM''

Of course this equational reasoning does not tell you anything about the performance properties you were interested in.

like image 99
Joachim Breitner Avatar answered Mar 06 '23 04:03

Joachim Breitner