Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimisations with folds

I am just curious if there are any (first order polymorphic only) optimisations with folds.

For maps, there's deforestation: map g (map f ls) => map (g . f) ls, and rev (map f ls) => rev_map f ls (faster in Ocaml).

But fold is so powerful, it seems to defy any optimisation.

like image 763
Yttrill Avatar asked Jan 31 '11 13:01

Yttrill


3 Answers

The obvious ones:

fold_left f acc (List.map g li) => fold_left (fun acc x -> f acc (g x)) acc li
fold_right f li acc => fold_left f acc li (* if (f,acc) is a monoid *)

You may be interested in the classical paper on the topic, "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire". Beware however that it is technical and has impenetrable notation.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125

Edit: my first version of the first rule was wrong, edited thanks to vincent-hugot.

like image 54
gasche Avatar answered Oct 18 '22 00:10

gasche


You can use deforestation on folds. In fact, map/map fusion is a special case of that.

The trick is to replace list construction by a special build function:

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

Now, using the standard definition of foldr

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr c n []     = n
foldr c n (x:xs) = c x (foldr c n xs)

We have the following equivalence:

foldr c n (build g) == g c n

(Actually this is only true under certain, but common, circumstances. For details see "Correctness of short-cut fusion").

If you write your list producing functions (including map) using build and your consumers using foldr, then the above equality can remove most intermediate lists. Haskell's list comprehensions are translated into combinations of build and foldr.

The downside of this approach is that it cannot handle left folds. Stream Fusion handles this just fine, though. It expresses list producers and transformers as streams (coinductive datatypes, kind of like iterators). The above paper is very readable, so I recommend taking a look.

The "bananas" paper mentioned by gasche goes into more details about kinds of folds and their equivalences.

Finally, there is Bird and Moor's "Algebra of Programming", which mentions transformations such as combining two folds into one.

like image 44
nominolo Avatar answered Oct 18 '22 01:10

nominolo


If you're interested going a bit deeper into theory, I suggest you to read something about catamorphisms, anamorphisms and hylomorphisms. While the category theory surrounding it may seem to be a bit scary, the concept isn't that difficult.

Catamorphisms are functions that consume recursive data structures and produce some kind of a value. Anamorphisms are functions that given some value (a kind of a seed) produce recursive data structures. In particular, foldr and build mentioned in the other anwers are functions to build catamorphisms and anamorphisms on lists. But this concept can be applied to basically any recursive data structure, such as different kinds of trees etc.

Now if you build a recursive data structure with an anamorphism and then consume it with a catamorphism, you get what is called a hylomorphism. In such a case, you actually don't need the intermediate structure. You can skip creating it and destroying it. This is often called deforestation.


Concerning map: This function is interesting that it's both a catamorphism and an anamorphism:

  • map consumes a list and produces something; but also
  • map produces a list, consuming something.

So you can view the composition of two maps map f . map g as a composition of an anamorphism (map g) with a catamorphism (map f), forming a hylomorphism. So you know can optimize (deforest) by not creating the intermediate list.

To be specific: You could write map in two ways, one using foldr and the other using build:

mapAna :: (a -> b) -> [a] -> [b]
mapAna f xs = build (mapAna' f xs)

mapAna' :: (a -> b) -> [a] -> (b -> c -> c) -> c -> c
mapAna' f []       cons nil = nil
mapAna' f (x : xs) cons nil = (f x) `cons` (mapAna' f xs cons nil)


mapCata :: (a -> b) -> [a] -> [b]
mapCata f xs = foldr (\x ys -> f x : ys) [] xs

and the composition map f (map g zs) as mapCata f (mapAna g zs), which after some simplifications and applying foldr c n (build g) == g c n results in map (f . g).

like image 44
Petr Avatar answered Oct 18 '22 01:10

Petr