I have been learning Haskell over the past few months and I came across an example of Monoids that has me puzzled. Given these definitions: <pre class="prettyprint"><code>data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq) instance F.Foldable Tree where foldMap f Empty = mempty foldMap f (Node x l r) = F.foldMap f l `mappend` f x `mappend` F.foldMap f r </code></pre> And this Tree: <pre class="prettyprint"><code>testTree = Node 5 (Node 3 (Node 1 Empty Empty) (Node 6 Empty Empty) ) (Node 9 (Node 8 Empty Empty) (Node 10 Empty Empty) ) </code></pre> If I run: <pre class="prettyprint"><code>ghci> F.foldl (+) 0 testTree 42 ghci> F.foldl (*) 1 testTree 64800 </code></pre> How does GHCi know what Monoid to use for the mappend when it's folding? Because by default the numbers in the tree are just of the type Num, and we never explicitly said they where of some Monoid such as Sum or Product. So how does GHCi infer the correct Monoid to use? Or am I totally off at this point? Example source: http://learnyouahaskell.com/functors-applicative-functors-and-monoids#monoids

Short answer: it is a type constraint in the signature of <code>foldMap</code>. If we look to the source code of <code>Foldable</code> (more specifically <code>foldMap</code>), we see: <pre class="prettyprint"><code>class Foldable (t :: * -> *) where ... foldMap :: Monoid m => (a -> m) -> t a -> m</code></pre> So that means that if we declare <code>Tree</code> a member of <code>Foldable</code> (not that <code>Tree</code> has kind <code>* -> *</code>), it means that a <code>foldMap</code> is defined over that tree, such that: <code>foldMap :: Monoid m => (a -> m) -> Tree a -> m</code>. So this thus means that the result type (and the result of the function passed to <code>foldMap</code>) <code>m</code> must be a <code>Monoid</code>. Haskell is statically typed: after compile time, Haskell knows exactly the types that are passed in an out of every function instance. So that means it knows for instance what the output type will be, and thus how to handle it. Now <code>Int</code> is not an instance of <code>Monoid</code>. You here use <code>F.foldl (+) 0 testTree</code>, so that means that you more or less have constructed an "ad hoc" monoid. This works if we look at the source code of <code>foldl</code>: <blockquote> <pre class="prettyprint"><code>foldl :: (b -> a -> b) -> b -> t a -> b foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z </code></pre> </blockquote> This has a lot of logic surrounding the parameters <code>f</code>, <code>z</code> and <code>t</code>. So let us break that down first. Let us first take a look at <code>Dual . Endo . flip f</code>. This is short for: <pre class="prettyprint"><code>helper = \x -> Dual (Endo (\y -> f y x)) </code></pre> <code>Dual</code> and <code>Endo</code> are types with each one constructor that takes one parameter. So we wrap the outcome of <code>f y x</code> in the <code>Dual (Endo ...)</code> constructors. We will use this as the function of <code>foldMap</code>. If our <code>f</code> has type <code>a -> b -> a</code>, then this function has type <code>b -> Dual (Endo a)</code>. So the output type of the function passed to <code>foldMap</code> has output type <code>Dual (Endo a)</code>. Now if we inspect the source code, we see two intersting things: <blockquote> <pre class="prettyprint"><code>instance Monoid (Endo a) where mempty = Endo id Endo f `mappend` Endo g = Endo (f . g) instance Monoid a => Monoid (Dual a) where mempty = Dual mempty Dual x `mappend` Dual y = Dual (y `mappend` x) </code></pre> </blockquote> (note that it is <code>y `mappend` x</code>, not <code>x `mappend` y</code>). So what happens here is that the <code>mempty</code> that is used in the <code>foldMap</code> is <code>mempty = Dual mempty = Dual (Endo id)</code>. So a <code>Dual (Endo ...)</code> that encapsulates the identity function. Furthermore the <code>mappend</code> of two duals comes down to function composition of the values in <code>Endo</code>. So: <pre class="prettyprint"><code>mempty = Dual (Endo id) mappend (Dual (Endo f)) (Dual (Endo g)) = Dual (Endo (g . f)) </code></pre> So that means that if we fold over the tree, in case we see an <code>Empty</code> (a leaf), we will return <code>mempty</code>, and in case we see a <code>Node x l r</code>, we will perform the <code>mappend</code> as described above. So a "specialized" <code>foldMap</code> will look like: <pre class="prettyprint"><code>-- specialized foldMap foldMap f Empty = Dual (Endo id) foldMap f (Node x l r) = Dual (Endo (c . b . a)) where Dual (Endo a) = foldMap f l Dual (Endo b) = helper x Dual (Endo c) = foldMap f l </code></pre> So for every <code>Node</code> we make a function composition right-to-left over the the children and item of the node. <code>a</code> and <code>c</code> can be compositions of the tree as well (since these are recursive calls). In case of a <code>Leaf</code>, we do nothing (we return <code>id</code>, but the composition over an <code>id</code> is a no-op). So that means that if we have a tree: <pre class="prettyprint"><code>5 |- 3 | |- 1 | `- 6 `- 9 |- 8 `- 10 </code></pre> This will result in a function: <pre class="prettyprint"><code>(Dual (Endo ( (\x -> f x 10) . (\x -> f x 9) . (\x -> f x 8) . (\x -> f x 5) . (\x -> f x 6) . (\x -> f x 3) . (\x -> f x 1) ) ) ) </code></pre> (omitted the identities, to make it a cleaner). This is the outcome of <code>getDual (foldMap (Dual . Endo . flip f))</code>. But now we need to post process this result. With <code>getDual</code>, we get the content wrapped in the <code>Dual</code> constructor. So now we have: <pre class="prettyprint"><code>Endo ( (\x -> f x 10) . (\x -> f x 9) . (\x -> f x 8) . (\x -> f x 5) . (\x -> f x 6) . (\x -> f x 3) . (\x -> f x 1) ) </code></pre> and with <code>appEndo</code>, we obtain the function wrapped in <code>Endo</code>, so: <pre class="prettyprint"><code>( (\x -> f x 10) . (\x -> f x 9) . (\x -> f x 8) . (\x -> f x 5) . (\x -> f x 6) . (\x -> f x 3) . (\x -> f x 1) ) </code></pre> and then we apply this to <code>z</code> the "initial" value. So that means that we will process the chain starting with <code>z</code> (the initial element), and apply it like: <pre class="prettyprint"><code>f (f (f (f (f (f (f z 1) 3) 6) 5) 8) 9) 10 </code></pre> So we have constructed some sort of Monoid, where <code>mappend</code> is replaced by <code>f</code>, and <code>mempty</code> as a no-op (the identity function).

Monoids and Num in Haskell

Tags:

haskell

monoids

I have been learning Haskell over the past few months and I came across an example of Monoids that has me puzzled.

Given these definitions:

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq) 

instance F.Foldable Tree where  
    foldMap f Empty = mempty  
    foldMap f (Node x l r) = F.foldMap f l `mappend`  
                             f x           `mappend`  
                             F.foldMap f r

And this Tree:

testTree = Node 5  
        (Node 3  
            (Node 1 Empty Empty)  
            (Node 6 Empty Empty)  
        )  
        (Node 9  
            (Node 8 Empty Empty)  
            (Node 10 Empty Empty)  
        )

If I run:

ghci> F.foldl (+) 0 testTree  
42  
ghci> F.foldl (*) 1 testTree  
64800

How does GHCi know what Monoid to use for the mappend when it's folding? Because by default the numbers in the tree are just of the type Num, and we never explicitly said they where of some Monoid such as Sum or Product.

So how does GHCi infer the correct Monoid to use? Or am I totally off at this point?

Example source: http://learnyouahaskell.com/functors-applicative-functors-and-monoids#monoids

224

asked Nov 30 '17 15:11

Bradley Nowacki

2 Answers

Short answer: it is a type constraint in the signature of foldMap.

If we look to the source code of Foldable (more specifically foldMap), we see:

class Foldable (t :: * -> *) where
  ...
  foldMap :: Monoid m => (a -> m) -> t a -> m

So that means that if we declare Tree a member of Foldable (not that Tree has kind * -> *), it means that a foldMap is defined over that tree, such that: foldMap :: Monoid m => (a -> m) -> Tree a -> m. So this thus means that the result type (and the result of the function passed to foldMap) m must be a Monoid.

Haskell is statically typed: after compile time, Haskell knows exactly the types that are passed in an out of every function instance. So that means it knows for instance what the output type will be, and thus how to handle it.

Now Int is not an instance of Monoid. You here use F.foldl (+) 0 testTree, so that means that you more or less have constructed an "ad hoc" monoid. This works if we look at the source code of foldl:

foldl :: (b -> a -> b) -> b -> t a -> b
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z

This has a lot of logic surrounding the parameters f, z and t. So let us break that down first.

Let us first take a look at Dual . Endo . flip f. This is short for:

helper = \x -> Dual (Endo (\y -> f y x))

Dual and Endo are types with each one constructor that takes one parameter. So we wrap the outcome of f y x in the Dual (Endo ...) constructors.

We will use this as the function of foldMap. If our f has type a -> b -> a, then this function has type b -> Dual (Endo a). So the output type of the function passed to foldMap has output type Dual (Endo a). Now if we inspect the source code, we see two intersting things:

instance Monoid (Endo a) where
        mempty = Endo id
        Endo f `mappend` Endo g = Endo (f . g)

instance Monoid a => Monoid (Dual a) where
        mempty = Dual mempty
        Dual x `mappend` Dual y = Dual (y `mappend` x)

(note that it is y `mappend` x, not x `mappend` y).

So what happens here is that the mempty that is used in the foldMap is mempty = Dual mempty = Dual (Endo id). So a Dual (Endo ...) that encapsulates the identity function.

Furthermore the mappend of two duals comes down to function composition of the values in Endo. So:

mempty = Dual (Endo id)
mappend (Dual (Endo f)) (Dual (Endo g)) = Dual (Endo (g . f))

So that means that if we fold over the tree, in case we see an Empty (a leaf), we will return mempty, and in case we see a Node x l r, we will perform the mappend as described above. So a "specialized" foldMap will look like:

-- specialized foldMap
foldMap f Empty = Dual (Endo id)  
foldMap f (Node x l r) =  Dual (Endo (c . b . a))
        where Dual (Endo a) = foldMap f l
              Dual (Endo b) = helper x
              Dual (Endo c) = foldMap f l

So for every Node we make a function composition right-to-left over the the children and item of the node. a and c can be compositions of the tree as well (since these are recursive calls). In case of a Leaf, we do nothing (we return id, but the composition over an id is a no-op).

So that means that if we have a tree:

5
|- 3
|  |- 1
|  `- 6
`- 9
   |- 8
   `- 10

This will result in a function:

(Dual (Endo ( (\x -> f x 10) .
              (\x -> f x 9) .
              (\x -> f x 8) .
              (\x -> f x 5) .
              (\x -> f x 6) .
              (\x -> f x 3) .
              (\x -> f x 1)
            )
      )
)

(omitted the identities, to make it a cleaner). This is the outcome of getDual (foldMap (Dual . Endo . flip f)). But now we need to post process this result. With getDual, we get the content wrapped in the Dual constructor. So now we have:

Endo ( (\x -> f x 10) .
       (\x -> f x 9) .
       (\x -> f x 8) .
       (\x -> f x 5) .
       (\x -> f x 6) .
       (\x -> f x 3) .
       (\x -> f x 1)
     )

and with appEndo, we obtain the function wrapped in Endo, so:

( (\x -> f x 10) .
  (\x -> f x 9) .
  (\x -> f x 8) .
  (\x -> f x 5) .
  (\x -> f x 6) .
  (\x -> f x 3) .
  (\x -> f x 1)
)

and then we apply this to z the "initial" value. So that means that we will process the chain starting with z (the initial element), and apply it like:

f (f (f (f (f (f (f z 1) 3) 6) 5) 8) 9) 10

So we have constructed some sort of Monoid, where mappend is replaced by f, and mempty as a no-op (the identity function).

answered Oct 07 '22 22:10

Willem Van Onsem

It doesn't need to. foldl is translated into foldr which translates into foldMap over Endo which means function composition which means simple nesting of the function you have supplied.

Or something. Meaning, foldl could be translated into foldMap over Dual . Endo which composes left-to-right, etc..

update: yes, the docs says:

Foldable instances are expected to satisfy the following laws:

foldr f z t = appEndo (foldMap (Endo . f) t ) z
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z  -- << --
fold = foldMap id

Dual (Endo f) <> Dual (Endo g) = Dual (Endo g <> Endo f) = Dual (Endo (g . f)). So when appEndo strikes, the chain of functions that's been built, i.e.

    ((+10) . (+9) . (+8) . (+5) . ... . (+1))

or equivalent (here, shown for the (+) case), is applied to the user-supplied value -- in your case,

Another thing to notice is that Endo and Dual are newtypes, so all these machinations will be done by compiler, and be gone by run time.

answered Oct 07 '22 20:10

Will Ness

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Monoids and Num in Haskell

Tags:

haskell

monoids

Bradley Nowacki

People also ask

2 Answers

Willem Van Onsem

Will Ness

Recent Activity

Donate For Us

Monoids and Num in Haskell

Tags:

haskell

monoids

Bradley Nowacki

People also ask

2 Answers

Willem Van Onsem

Will Ness

Related questions

Recent Activity

Donate For Us