I recently read [1] and [2], which speak about histomorphism (and dynamorphisms) which are recursion schemes that can express e.g. dynamic programming. Unfortunately the papers aren't accessible if you don't know category theory, even though there's code in there that looks like Haskell.
Could someone explain histomorphisms with an example that uses real Haskell code?
Let's start by defining a data type that we will use as an example:
data Nat = S Nat | Z
This data type encodes the natural numbers in Peano style. This means that we have 0 and a way to produce the successor of any natural number.
We can construct new natural numbers from integers easily:
-- Allow us to construct Nats
mkNat :: Integer -> Nat
mkNat n | n < 0 = error "cannot construct negative natural number"
mkNat 0 = Z
mkNat n = S $ mkNat (n-1)
Now, we'll first define a catamorphism for this type, because a histomorphism is quite similar to it and a catamorphism is easier to understand.
A catamorphism allows to "fold" or "tear down" a structure. It only expects a function that knows how to fold the structure when all recursive terms have been folded already. Let's define such a type, similar to Nat, but with all recursive instances replaced by some value of type a
:
data NatF a = SF a | ZF -- Aside: this is just Maybe
Now, we can define the type of our catamorphism for Nat:
cata :: (NatF a -> a)
-> (Nat -> a)
Given a function that knows how to fold the non-recursive structure NatF a
to an a
, cata
turns that into a function to fold a whole Nat
.
The implementation of cata is quite simple: first fold the recursive subterm (if there is any) and the apply our function:
cata f Z = f ZF -- No subterm to fold, base case
cata f (S subterm) = f $ SF $ cata f subterm -- Fold subterm first, recursive case
We can use this catamorphism to convert Nat
s back to Integer
s, like this:
natToInteger :: Nat -> Integer
natToInteger = cata phi where
-- We only need to provide a function to fold
-- a non-recursive Nat-like structure
phi :: NatF Integer -> Integer
phi ZF = 0
phi (SF x) = x + 1
So with cata
, we get access to the value of the immediate subterm. But imagine we like to access the values of transitive subterms too, for example, when defining a fibonacci function. Then, we need not only access to the previous value, but also to the 2-nd previous value. This is where histomorphisms come into play.
A histomorphism (histo sounds a lot like "history") allows us to access all previous values, not just the most recent one. This means we now get a list of values, not just a single one, so the type of histomorphism is:
-- We could use the type NatF (NonEmptyList a) here.
-- But because NatF is Maybe, NatF (NonEmptyList a) is equal to [a].
-- Using just [a] is a lot simpler
histo :: ([a] -> a)
-> Nat -> a
histo f = head . go where
-- go :: Nat -> [a] -- This signature would need ScopedTVs
go Z = [f []]
go (S x) = let subvalues = go x in f subvalues : subvalues
Now, we can define fibN
as follows:
-- Example: calculate the n-th fibonacci number
fibN :: Nat -> Integer
fibN = histo $ \x -> case x of
(x:y:_) -> x + y
_ -> 1
Aside: even though it might appear so, histo is not more powerful than cata. You can see that yourself by implementing histo in terms of cata and the other way around.
What I didn't show in the above example is that cata
and histo
can be implemented very generally if you define your type as a fixpoint of a functor. Our Nat
type is just the fixed point of the Functor NatF
.
If you define histo
in the generic way, then you also need to come up with a type like the NonEmptyList
in our example, but for any functor. This type is precisely Cofree f
, where f
is the functor you took the fixed point of. You can see that it works for our example: NonEmptyList
is just Cofree Maybe
. This is how you get to the generic type of histo
:
histo :: Functor f
=> (f (Cofree f a) -> a)
-> Fix f -- ^ This is the fixed point of f
-> a
You can think of f (Cofree f a)
as kind of a stack, where with each "layer", you can see a less-folded structure. At the top of the stack, every immediate subterm is folded. Then, if you go one layer deeper, the immediate subterm is no longer folded, but the sub-subterms are all already folded (or evaluated, which might make more sense to say in the case of ASTs). So you can basically see "the sequence of reductions" that has been applied (= the history).
We can think of there as being a generalization continuum from cata
to histo
to dyna
. In the terminology of recursion-schemes
:
Foldable t => (Base t a -> a) -> (t -> a) -- (1)
Foldable t => (Base t (Cofree (Base t) a) -> a) -> (t -> a) -- (2)
Functor f => (f (Cofree f a) -> a) -> (t -> f t) -> (t -> a) -- (3)
where (1) is cata
, (2) is histo
, and (3) is dyna
. A high-level overview of this generalization is that histo
improves cata
by maintaing the history of all partial "right folds" and dyna
improves histo
by letting operating on any type t
so long as we can make an f
-coalgebra for it, not just the Foldable
ones (which have universal Base t
-coalgebras as Foldable
witnesses that data types are final coalgebras).
We can almost read off their properties by simply looking at what it takes to fulfill their types.
For instance, a classic use of cata
is to define foldr
data instance Prim [a] x = Nil | Cons a x
type instance Base [a] = Prim [a]
instance Foldable [a] where
project [] = Nil
project (a:as) = Cons a as
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr cons nil = cata $ \case
Nil -> nil
Cons a b -> cons a b
importantly, we note that foldr
generates the "next" partial right fold value by using exclusively the "previous" right fold value. This is why it can be implemented using cata
: it only needs the most immediately previous partial fold result.
As histo
generalizes cata
we ought to be able to do the same with it. Here's a histo
-based foldr
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr cons nil = histo $ \case
Nil -> nil
Cons a (b :< _) -> cons a b
we can see that we no longer immediately have the immediately previous fold result, but instead have to reach into the first layer of the Cofree
in order to find it. But Cofree
is a stream and contains potentially infinitely many "previous fold values" and we can dig as deeply into it as we like. This is what gives histo
its "historical" power. For instance, we can write a fairly direct tail
using histo
which is more difficult to do with cata
alone:
tail :: [a] -> Maybe [a]
tail = histo $ \case
Nil -> Nothing -- empty list
Cons _ (b :< x) -> case x of
Nil -> Just [] -- length 1 list
Cons a _ -> fmap (a:) b
The style is a little indirect, but essentially because we can look back into the past two steps we can respond to length-1 lists differently from length-0 lists or length-n
lists.
To take the final step to generalize histo
to dyna
we simply replace the natural projection by any coalgebra. We could thus implement histo
in terms of dyna
quite easily
histo phi = dyna phi project -- project is from the Foldable class
So now we can apply histo
folds to any type which can even be partially viewed as a list (well, so long as we keep with the running example and use Prim [a]
as the Functor
, f
).
(Theoretically, there's a restriction that this coalgebra eventually halts, e.g. we can't treat infinite streams, but that has more to do with theory and optimization than use. In use, such a thing simply has to be lazy and small enough to terminate.)
(This mirrors the idea of representing initial algebras by their ability to
project :: t -> Base t t
. If this were truly a total inductive type then you could only project so many times before hitting the end.)
To replicate the Catalan numbers instance from the linked paper we can create non-empty lists
data NEL a = Some a | More a (NEL a)
data NELf a x = Somef a | Moref a x deriving Functor
and create the coalgebra on natural numbers called natural
which, suitably unfolded, produces a countdown NEL
natural :: Int -> NELf Int Int
natural 0 = Somef 0
natural n = Moref n (n-1)
then we apply a histo
-style fold to the NELf
-view of a natural number to produce the n
-th Catalan number.
-- here's a quick implementation of `dyna` using `recursion-schemes`
zcata
:: (Comonad w, Functor f) =>
(a -> f a) -> (f (w (w c)) -> w b) -> (b -> c) -> a -> c
zcata z k g = g . extract . c where
c = k . fmap (duplicate . fmap g . c) . z
dyna :: Functor f => (f (Cofree f c) -> c) -> (a -> f a) -> a -> c
dyna phi z = zcata z distHisto phi
takeC :: Int -> Cofree (NELf a) a -> [a]
takeC 0 _ = []
takeC n (a :< Somef v) = [a]
takeC n (a :< Moref v as) = a : takeC (n-1) as
catalan :: Int -> Int
catalan = dyna phi natural where
phi :: NELf Int (Cofree (NELf Int) Int) -> Int
phi (Somef 0) = 1
phi (Moref n table) = sum (zipWith (*) xs (reverse xs))
where xs = takeC n table
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With