Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are the properties of the unsided fold?

Foldl and folr are 2 very important functions for FP and Haskell, but I have never heard much about the unsided fold:

fold f [a,b,c,d] = (f (f a b) (f c d))

That is, a fold that operates on binary associative functions (so the order of application doesn't matter). If I recall correctly, this is very common in databases as it can be parallelized. So, about it, I ask:

  1. Is it, like foldr, universal?
  2. Like foldr, can you define every important function using it?
  3. Is there a fusion rule for it, similar to those for foldr/build and unfoldr/destroy?
  4. Why is it barely mentioned?
  5. Any consideration worth mentioning?
like image 997
MaiaVictor Avatar asked Jan 06 '14 01:01

MaiaVictor


2 Answers

This is often thought of as tree reduction and is important in parallel computation since it embodies divide and conquer reduction.

First, if the combining function is non-associative then obviously there are big differences between foldl, foldr, and "unsided fold", so let's assume that we combine with an associative operation. Immediately, all folds can be represented by a Monoid.

foldlm :: Monoid m => [m] -> m
foldlm = foldl mappend mempty

foldrm :: Monoid m => [m] -> m
foldrm = foldr mappend mempty

usfoldm :: Monoid m => [m] -> m
usfoldm = foldTree mappend mempty . buildTree

Which is better represented by foldMap :: Monoid m => (a -> m) -> [a] -> m which is defined by default using foldr.

foldMap f = foldr (mappend . f) mempty

Which is sufficient, if given a final extraction step, to produce tree-like unsided folding given a Monoid defined on a tree-like sequence type that controls how the element-Monoids are combined.

data Tree a
singleton :: a -> Tree a
instance Monoid (Tree a) where ...
foldTree :: Monoid a => Tree a -> a

foldTree . foldMap singleton :: Monoid a => [a] -> a

Finally, we've seen that we can get foldMap from foldr, but we can also get foldr from foldMap

newtype Endo a = Endo { appEndo :: a -> a }

instance Monoid (Endo a) where
  mempty = id
  mappend (Endo f) (Endo g) = Endo (f . g)

foldr f z as = appEndo (foldMap (Endo . f) as) z

Generally, foldMap is considered to be more primitive since it lets the underlying Monoid choose its preferred folding method. This means that we're free to write more efficient or more parallel folds on a per-datatype level, though it can still be challenging to do so properly.

It's worth noting that the foldMap abstraction is usually found as a instance method of Foldable which is a very popular, but more new Haskell typeclass. It's also considered to be a little bit silly despite its practical usefulness because Foldable has very few meaningful laws excepting that

toList :: Foldable f => f a -> [a]

exists, which also lets us see the Monoidal nature of foldMap as [a] is the universal Monoid which we can recover with foldr.

For further investigation of fusion rules, it would be valuable to read about a proposed dual typeclass Buildable as in Gershom Bazerman's Building up to a Point via Adjunctions.

And finally, as for popularity, I think it's definitely the preferred method of instantiating Foldable these days since it allows for more efficient Monoid folds if necessary, but it's definitely newer than both foldl and foldr which likely plays into its relative obscurity.

like image 146
J. Abrahamson Avatar answered Sep 28 '22 01:09

J. Abrahamson


The function you're looking for is Data.Foldable.foldMap.

foldMap :: Data.Monoid.Monoid m => (a -> m) -> t a -> m

This function is isomorphic to foldr. As a proof, note that one of foldMap or foldr is a minimal complete definition of Foldable, meaning that each can be written in terms of the other. That answers your first two questions affirmatively.

I don't know of fusion rules for foldMap specifically, but I'm sure that one could exist. At the very least, foldr fusion rules should apply to some extent.

I have no idea why it's barely mentioned.

One consideration worth mentioning is that, as far as lists are concerned, you can't always take full advantage of this fold. Since lists are constructed out of cons cells, doing a tree fold would mean traversing half the list, then recursing down each half and traversing half again, etc. This is a lot of extra traversals compared to foldl or foldr. For non-list structures a tree fold can be much more efficient, and even for lists it's possible to take some advantage of this. There was a nice blog about one such task recently.

like image 22
John L Avatar answered Sep 28 '22 02:09

John L