I have been learning Haskell over the past few months and I came across an example of Monoids that has me puzzled.
Given these definitions:
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)
instance F.Foldable Tree where
foldMap f Empty = mempty
foldMap f (Node x l r) = F.foldMap f l `mappend`
f x `mappend`
F.foldMap f r
And this Tree:
testTree = Node 5
(Node 3
(Node 1 Empty Empty)
(Node 6 Empty Empty)
)
(Node 9
(Node 8 Empty Empty)
(Node 10 Empty Empty)
)
If I run:
ghci> F.foldl (+) 0 testTree
42
ghci> F.foldl (*) 1 testTree
64800
How does GHCi know what Monoid to use for the mappend when it's folding? Because by default the numbers in the tree are just of the type Num, and we never explicitly said they where of some Monoid such as Sum or Product.
So how does GHCi infer the correct Monoid to use? Or am I totally off at this point?
Example source: http://learnyouahaskell.com/functors-applicative-functors-and-monoids#monoids
As we can see, Monoid in Haskell is a typeclass that has three methods: mempty , mappend , and mconcat . mempty is a value that doesn't impact the result when used together with the binary operation. In other words, it's the identity element of the monoid. mappend (or <> ) is a function that puts two monoids together.
On mconcat Or in other words, it's telling you something like what associativity tells you, that the order in which you fold up a list doesn't matter.
Strings, lists, and sequences are essentially the same monoid. An introduction for object-oriented programmers. This article is part of a series about monoids. In short, a monoid is an associative binary operation with a neutral element (also known as identity).
Function Composition Is A Monoid Compose takes two functions as arguments, and returns a function. So it's a magma. So according to the definition, functions form a monoid under the composition operation.
Short answer: it is a type constraint in the signature of foldMap
.
If we look to the source code of Foldable
(more specifically foldMap
), we see:
class Foldable (t :: * -> *) where
...
foldMap :: Monoid m => (a -> m) -> t a -> m
So that means that if we declare Tree
a member of Foldable
(not that Tree
has kind * -> *
), it means that a foldMap
is defined over that tree, such that: foldMap :: Monoid m => (a -> m) -> Tree a -> m
. So this thus means that the result type (and the result of the function passed to foldMap
) m
must be a Monoid
.
Haskell is statically typed: after compile time, Haskell knows exactly the types that are passed in an out of every function instance. So that means it knows for instance what the output type will be, and thus how to handle it.
Now Int
is not an instance of Monoid
. You here use F.foldl (+) 0 testTree
, so that means that you more or less have constructed an "ad hoc" monoid. This works if we look at the source code of foldl
:
foldl :: (b -> a -> b) -> b -> t a -> b foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z
This has a lot of logic surrounding the parameters f
, z
and t
. So let us break that down first.
Let us first take a look at Dual . Endo . flip f
. This is short for:
helper = \x -> Dual (Endo (\y -> f y x))
Dual
and Endo
are types with each one constructor that takes one parameter. So we wrap the outcome of f y x
in the Dual (Endo ...)
constructors.
We will use this as the function of foldMap
. If our f
has type a -> b -> a
, then this function has type b -> Dual (Endo a)
. So the output type of the function passed to foldMap
has output type Dual (Endo a)
. Now if we inspect the source code, we see two intersting things:
instance Monoid (Endo a) where mempty = Endo id Endo f `mappend` Endo g = Endo (f . g) instance Monoid a => Monoid (Dual a) where mempty = Dual mempty Dual x `mappend` Dual y = Dual (y `mappend` x)
(note that it is y `mappend` x
, not x `mappend` y
).
So what happens here is that the mempty
that is used in the foldMap
is mempty = Dual mempty = Dual (Endo id)
. So a Dual (Endo ...)
that encapsulates the identity function.
Furthermore the mappend
of two duals comes down to function composition of the values in Endo
. So:
mempty = Dual (Endo id)
mappend (Dual (Endo f)) (Dual (Endo g)) = Dual (Endo (g . f))
So that means that if we fold over the tree, in case we see an Empty
(a leaf), we will return mempty
, and in case we see a Node x l r
, we will perform the mappend
as described above. So a "specialized" foldMap
will look like:
-- specialized foldMap
foldMap f Empty = Dual (Endo id)
foldMap f (Node x l r) = Dual (Endo (c . b . a))
where Dual (Endo a) = foldMap f l
Dual (Endo b) = helper x
Dual (Endo c) = foldMap f l
So for every Node
we make a function composition right-to-left over the the children and item of the node. a
and c
can be compositions of the tree as well (since these are recursive calls). In case of a Leaf
, we do nothing (we return id
, but the composition over an id
is a no-op).
So that means that if we have a tree:
5
|- 3
| |- 1
| `- 6
`- 9
|- 8
`- 10
This will result in a function:
(Dual (Endo ( (\x -> f x 10) .
(\x -> f x 9) .
(\x -> f x 8) .
(\x -> f x 5) .
(\x -> f x 6) .
(\x -> f x 3) .
(\x -> f x 1)
)
)
)
(omitted the identities, to make it a cleaner). This is the outcome of getDual (foldMap (Dual . Endo . flip f))
. But now we need to post process this result. With getDual
, we get the content wrapped in the Dual
constructor. So now we have:
Endo ( (\x -> f x 10) .
(\x -> f x 9) .
(\x -> f x 8) .
(\x -> f x 5) .
(\x -> f x 6) .
(\x -> f x 3) .
(\x -> f x 1)
)
and with appEndo
, we obtain the function wrapped in Endo
, so:
( (\x -> f x 10) .
(\x -> f x 9) .
(\x -> f x 8) .
(\x -> f x 5) .
(\x -> f x 6) .
(\x -> f x 3) .
(\x -> f x 1)
)
and then we apply this to z
the "initial" value. So that means that we will process the chain starting with z
(the initial element), and apply it like:
f (f (f (f (f (f (f z 1) 3) 6) 5) 8) 9) 10
So we have constructed some sort of Monoid, where mappend
is replaced by f
, and mempty
as a no-op (the identity function).
It doesn't need to. foldl
is translated into foldr
which translates into foldMap
over Endo
which means function composition which means simple nesting of the function you have supplied.
Or something. Meaning, foldl
could be translated into foldMap
over Dual . Endo
which composes left-to-right, etc..
update: yes, the docs says:
Foldable instances are expected to satisfy the following laws:
foldr f z t = appEndo (foldMap (Endo . f) t ) z foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z -- << -- fold = foldMap id
Dual (Endo f) <> Dual (Endo g) = Dual (Endo g <> Endo f) = Dual (Endo (g . f))
. So when appEndo
strikes, the chain of functions that's been built, i.e.
((+10) . (+9) . (+8) . (+5) . ... . (+1))
or equivalent (here, shown for the (+)
case), is applied to the user-supplied value -- in your case,
0
Another thing to notice is that Endo
and Dual
are newtype
s, so all these machinations will be done by compiler, and be gone by run time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With