How to use the fold function in Haskell with other datatypes

Tags: haskell, algebraic-data-types, fold

Is there a universal way of thinking about how to create a fold function for a new data type?

For example, the fold function for the data type Tree is:

data Tree t = Leaf | Node t (Tree t) (Tree t)
              deriving (Eq,Ord,Show)

treeFold :: (a -> b -> b -> b) -> b -> Tree a -> b
treeFold f e Leaf = e
treeFold f e (Node x l r) = f x (treeFold f e l) (treeFold f e r)

How would I write the fold function for the following data type?

data Json a = Val a | Obj [(String, Json a)]

I know the type would have to contain two functions, one for each of the cases Val and Obj. What do I have to consider when writing the fold? I hope my question makes sense. I've come across many exercises that ask for a fold function for some data type, and I can't seem to find the pattern.

asked Oct 16 '22 13:10

SavannahGemp

1 Answer

As Willem Van Onsem pointed out in a (now-deleted) comment, what you are trying to implement is also called a catamorphism. I've written what you might call a beginner's view of catamorphisms in my answer to "Does each type have a unique catamorphism?". You can derive the catamorphism for a type (or show that none can exist) quite mechanically. If your type has N constructors, the fold function must take N+1 arguments: one function for each constructor, plus one value of your type. Each such function takes one argument per field of its corresponding constructor (if the constructor has no fields, it is an ordinary value, which you can think of as a 0-ary function) and returns a value of whatever type the catamorphism returns.

That's complicated in words, so I'll copy the relevant code from the answer I linked above as an example:

data X a b f = A Int b
             | B
             | C (f a) (X a b f)
             | D a

xCata :: (Int -> b -> r)
      -> r
      -> (f a -> r -> r)
      -> (a -> r)
      -> X a b f
      -> r
xCata a b c d v = case v of
  A i x -> a i x
  B -> b
  C f x -> c f (xCata a b c d x)
  D x -> d x

Observe that each of the functions (a, b, c, d) has one argument per field in the associated constructor. In most cases you simply call the function with each of the constructor's fields, but what's up with the C case? Why do we write c f (xCata a b c d x) instead of c f x? This is where the recursion happens: xCata's job is to recursively traverse (fold) the entire tree represented by your ADT, turning each X a b f value into a result of type r. That is why the function for C takes an r, not an X a b f, as its second argument. Happily, there's only one possible way to do that transformation: call xCata with the same set of functions you were passed to begin with.
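
Applying the same recipe to the Json type from the question might look like the sketch below. The one new wrinkle is that the recursive occurrence of Json a sits inside a list of pairs, so the recursive call has to be mapped over that list before the Obj function is applied. The names jsonFold, sumJson, keysJson and example are my own, chosen for illustration; they do not come from the question or from any library.

```haskell
-- The data type from the question.
data Json a = Val a | Obj [(String, Json a)]

-- One function per constructor. In the Obj case every nested 'Json a'
-- has been replaced by an already-folded result of type 'r'.
jsonFold :: (a -> r) -> ([(String, r)] -> r) -> Json a -> r
jsonFold val obj = go
  where
    go (Val x)      = val x
    go (Obj fields) = obj [ (key, go child) | (key, child) <- fields ]

-- Hypothetical use: add up every number stored in the structure.
sumJson :: Json Int -> Int
sumJson = jsonFold id (sum . map snd)

-- Hypothetical use: collect every key, outer keys before inner ones.
keysJson :: Json a -> [String]
keysJson = jsonFold (const []) (concatMap (\(key, inner) -> key : inner))

-- A small sample value to try the folds on.
example :: Json Int
example = Obj [("a", Val 1), ("b", Obj [("c", Val 2), ("d", Val 3)])]
```

With these definitions loaded in GHCi, sumJson example evaluates to 6 and keysJson example to ["a","b","c","d"]. Neither function contains any explicit recursion of its own; all of it lives in jsonFold.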

answered Nov 15 '22 10:11

amalloy
