Given an arbitrary datastructure with a fixed point, can we construct an monoidal algebra without manually specifying all cases?
Assume we are given the datatype Expr
as below. Using the recursion-schemes
library, we can derive a base functor ExprF
, which automatically also has Functor
, Foldable
and Traversable
instances.
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TemplateHaskell #-}
import Data.Semigroup (Sum(..))
import Data.Functor.Foldable
import Data.Functor.Foldable.TH
import Prelude hiding (fail)
data Expr = Var String
| Const Int
| Add [Expr]
| Mult [Expr]
deriving Show
$(makeBaseFunctor ''Expr)
expr :: Fix ExprF
expr = ana project $ Add [Const 1, Const 2, Mult [Const 3, Const 4], Var "hello"]
Now, let's say we want to count the number of leaves in expr
. We can easily write an algebra for such a small datastructure:
alg (ConstF _) = 1
alg (VarF _) = 1
alg (AddF xs) = sum xs
alg (MulF xs) = sum xs
Now, we can call cata alg expr
, which returns 5
, the correct result.
Let's assume Expr
grows really big and complex and we don't want to manually write cases for all data constructors. How does cata
know how to combine the results from all cases? I suspect that this is possible using Monoid
s, possibly in conjunction with the Const
functor (not entirely sure about that last part though).
fail = getSum $ foldMap (const (Sum 1) . unfix) $ unfix expr
fail
returns 4
, whereas we actually have 5
leaves. I assume that the problem lies in the fixed point, because we are only able to peel of one layer of Fix
ing and therefore the Mult [..]
is only counted as one leaf.
Is it possible to somehow generically fold the entire fixed point and collecting the results in a Monoid
-like structure without manually specifying all instances? What I want is kind of foldMap
but in a more generic way.
I have a feeling I am missing something really obvious.
Here's the essence of a solution. I've switched on
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable, PatternSynonyms #-}
Let's just recap fixpoints and catamorphisms.
newtype Fix f = In {out :: f (Fix f)}
cata :: Functor f => (f t -> t) -> Fix f -> t
cata alg = alg . fmap (cata alg) . out
The algebra, alg :: f t -> t
, takes a node where the children have already been replaced by a t
value, then returns the t
for the parent. The cata
operator works by unpacking the parent node, processing all its children recursively, then applying alg
to finish the job.
So, if we want to count leaves in such a structure, we can start like this:
leaves :: (Foldable f, Functor f) => Fix f -> Integer
leaves = cata sumOrOne where
-- sumOrOne :: f Integer -> Integer
The algebra, sumOrOne
can see the number of leaves in each child of the parent node. We can use cata
because f
is a Functor
. And because f
is Foldable
, we can compute the total number of leaves in the children.
sumOrOne fl = case sum fl of
...
There are then two possibilities: if the parent has no children, its leaf sum will be 0
, which we can detect, but that means the parent is itself a leaf, so 1
should be returned. Otherwise, the leaf sum will be nonzero, in which case the parent is not a leaf, so its leaf sum is indeed the total leaf sum of its children. That gives us
leaves :: (Foldable f, Functor f) => Fix f -> Integer
leaves = cata sumOrOne where
sumOrOne fl{- number of leaves in each child-} = case sum fl of
0 -> 1 -- no leaves in my children means I am a leaf
l -> l -- otherwise, pass on the total
A quick example, based on Hutton's Razor (the expression language with integers and addition, which is often the simplest thing that illustrates the point). The expressions are generated from Hutton's functor.
data HF h = Val Int | h :+: h deriving (Functor, Foldable, Traversable)
I introduce some pattern synonyms to recover the look and feel of a bespoke type.
pattern V x = In (Val x)
pattern s :+ t = In (s :+: t)
I cook up a quick example expression, with some leaves that are three levels deep.
example :: Fix HF
example = (V 1 :+ V 2) :+ ((V 3 :+ V 4) :+ V 5)
Sure enough
Ok, modules loaded: Leaves.
*Leaves> leaves example
5
An alternative approach is to be functorial and foldable in substructures of interest, in this case, stuff at leaves. (We get exactly the free monads.)
data Tree f x = Leaf x | Node (f (Tree f x)) deriving (Functor, Foldable)
Once you've made the leaf/node separation part of your basic construction, you can visit the leaves directly with foldMap
. Throwing in a bit of Control.Newtype
, we get
ala' Sum foldMap (const 1) :: Foldable f => f x -> Integer
which is below the Fairbairn Threshold (i.e., short enough not to need a name and all the clearer for not having one).
The trouble, of course, is that data structures are often functorial in "substructures of interest" in multiple interesting but conflicting ways. Haskell isn't always the best at letting us access "found functoriality": we somehow have to predict the functoriality we need when we parametrise a data type at declaration time. But there's still time to change all that...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With