How to use the fold function in Haskell with other datatypes

Is there a universal way to think about how to create a fold function for a new data type?

For example, the fold function for the data type Tree is:

data Tree t = Leaf | Node t (Tree t) (Tree t)
              deriving (Eq,Ord,Show)

treeFold :: (a -> b -> b -> b) -> b -> Tree a -> b
treeFold _ e Leaf = e
treeFold f e (Node x l r) = f x (treeFold f e l) (treeFold f e r)

How would I write the fold function for the following data type?

data Json a = Val a | Obj [(String, Json a)]

I know the type would have to contain two functions, one for each of the cases Val and Obj. What do I have to consider when creating the fold? I hope my question makes sense. I've come across many exercises that ask for a fold function for some data type, and I can't seem to find the pattern.

SavannahGemp asked Oct 16 '22 13:10





1 Answer

As Willem Van Onsem pointed out in a (now-deleted) comment, what you are trying to implement is also called a catamorphism. I've written a bit about what I suppose you might call a beginner's view of catamorphisms, at Does each type have a unique catamorphism?.

You can derive the catamorphism for a type (or show that none can exist) quite mechanically. If your type has N constructors, the fold function must take N+1 arguments: one function for each constructor, plus one value of your type. Each such function takes one argument per field of its corresponding constructor (or, if the constructor has no fields, it is just an ordinary value, which you can imagine as a 0-ary function) and returns a value of whatever type the catamorphism returns.

It's complicated in words, so I'll copy the relevant code from the answer I linked above, as an exemplar:

data X a b f = A Int b
             | B
             | C (f a) (X a b f)
             | D a

xCata :: (Int -> b -> r)
      -> r
      -> (f a -> r -> r)
      -> (a -> r)
      -> X a b f
      -> r
xCata a b c d v = case v of
  A i x -> a i x
  B -> b
  C f x -> c f (xCata a b c d x)
  D x -> d x

Observe that each of the functions (a, b, c, d) has one argument per field in the associated constructor. In most cases, you simply call the function with each of the constructor's fields... but what's up with the C case? Why don't we write c f x instead of c f (xCata a b c d x)?

This is where the recursion happens: xCata's job is to recursively traverse (fold) the entire tree represented by your ADT, turning each X a b f value into a result of type r. That is also why the second argument of c has type r rather than X a b f. Happily, there's only one possible way to do that transformation: call xCata with the same set of functions you were passed to begin with.
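Applying the same recipe to the Json type from the question might look roughly like the sketch below. One wrinkle is that the recursive Json a values sit inside a list of pairs, so I've assumed the Obj handler should receive that same list with every child already folded to an r. The names jsonFold, sumVals, depth, keys and example are made up for illustration and don't come from the question.

```haskell
-- The Json type from the question.
data Json a = Val a | Obj [(String, Json a)]

-- Two constructors, so two handlers plus the value being folded.
--   Val has one field of type a            => handler :: a -> r
--   Obj has one field, a list of pairs;
--   each nested Json a is replaced by an r => handler :: [(String, r)] -> r
jsonFold :: (a -> r) -> ([(String, r)] -> r) -> Json a -> r
jsonFold val obj v = case v of
  Val x   -> val x
  -- Recurse into every child with the same handlers we were given.
  Obj kvs -> obj [ (k, jsonFold val obj child) | (k, child) <- kvs ]

-- Illustrative uses (hypothetical helper names).

-- Sum of all leaf values.
sumVals :: Num a => Json a -> a
sumVals = jsonFold id (sum . map snd)

-- Nesting depth: a bare Val counts as 0, each Obj adds one level.
depth :: Json a -> Int
depth = jsonFold (const 0) (\kvs -> 1 + maximum (0 : map snd kvs))

-- Every key, parents before their children.
keys :: Json a -> [String]
keys = jsonFold (const []) (concatMap (\(k, ks) -> k : ks))

example :: Json Int
example = Obj [("a", Val 1), ("b", Obj [("c", Val 2), ("d", Val 3)])]
```

With these definitions, sumVals example evaluates to 6, depth example to 2, and keys example to ["a","b","c","d"]. None of the helpers recurses explicitly; the traversal lives entirely in jsonFold, just as it does in xCata.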

amalloy answered Nov 15 '22 10:11
