Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Monoids and Num in Haskell

I have been learning Haskell over the past few months and I came across an example of Monoids that has me puzzled.

Given these definitions:

data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show, Read, Eq) 

instance F.Foldable Tree where  
    foldMap f Empty = mempty  
    foldMap f (Node x l r) = F.foldMap f l `mappend`  
                             f x           `mappend`  
                             F.foldMap f r  

And this Tree:

testTree = Node 5  
        (Node 3  
            (Node 1 Empty Empty)  
            (Node 6 Empty Empty)  
        )  
        (Node 9  
            (Node 8 Empty Empty)  
            (Node 10 Empty Empty)  
        )  

If I run:

ghci> F.foldl (+) 0 testTree  
42  
ghci> F.foldl (*) 1 testTree  
64800  

How does GHCi know what Monoid to use for the mappend when it's folding? Because by default the numbers in the tree are just of the type Num, and we never explicitly said they where of some Monoid such as Sum or Product.

So how does GHCi infer the correct Monoid to use? Or am I totally off at this point?

Example source: http://learnyouahaskell.com/functors-applicative-functors-and-monoids#monoids

like image 224
Bradley Nowacki Avatar asked Nov 30 '17 15:11

Bradley Nowacki


People also ask

What are monoids in Haskell?

As we can see, Monoid in Haskell is a typeclass that has three methods: mempty , mappend , and mconcat . mempty is a value that doesn't impact the result when used together with the binary operation. In other words, it's the identity element of the monoid. mappend (or <> ) is a function that puts two monoids together.

What is Haskell Mconcat?

On mconcat Or in other words, it's telling you something like what associativity tells you, that the order in which you fold up a list doesn't matter.

Are lists monoids?

Strings, lists, and sequences are essentially the same monoid. An introduction for object-oriented programmers. This article is part of a series about monoids. In short, a monoid is an associative binary operation with a neutral element (also known as identity).

Are functions monoids?

Function Composition Is A Monoid Compose takes two functions as arguments, and returns a function. So it's a magma. So according to the definition, functions form a monoid under the composition operation.


2 Answers

Short answer: it is a type constraint in the signature of foldMap.

If we look to the source code of Foldable (more specifically foldMap), we see:

class Foldable (t :: * -> *) where
  ...
  foldMap :: Monoid m => (a -> m) -> t a -> m

So that means that if we declare Tree a member of Foldable (not that Tree has kind * -> *), it means that a foldMap is defined over that tree, such that: foldMap :: Monoid m => (a -> m) -> Tree a -> m. So this thus means that the result type (and the result of the function passed to foldMap) m must be a Monoid.

Haskell is statically typed: after compile time, Haskell knows exactly the types that are passed in an out of every function instance. So that means it knows for instance what the output type will be, and thus how to handle it.

Now Int is not an instance of Monoid. You here use F.foldl (+) 0 testTree, so that means that you more or less have constructed an "ad hoc" monoid. This works if we look at the source code of foldl:

foldl :: (b -> a -> b) -> b -> t a -> b
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z

This has a lot of logic surrounding the parameters f, z and t. So let us break that down first.

Let us first take a look at Dual . Endo . flip f. This is short for:

helper = \x -> Dual (Endo (\y -> f y x))

Dual and Endo are types with each one constructor that takes one parameter. So we wrap the outcome of f y x in the Dual (Endo ...) constructors.

We will use this as the function of foldMap. If our f has type a -> b -> a, then this function has type b -> Dual (Endo a). So the output type of the function passed to foldMap has output type Dual (Endo a). Now if we inspect the source code, we see two intersting things:

instance Monoid (Endo a) where
        mempty = Endo id
        Endo f `mappend` Endo g = Endo (f . g)

instance Monoid a => Monoid (Dual a) where
        mempty = Dual mempty
        Dual x `mappend` Dual y = Dual (y `mappend` x)

(note that it is y `mappend` x, not x `mappend` y).

So what happens here is that the mempty that is used in the foldMap is mempty = Dual mempty = Dual (Endo id). So a Dual (Endo ...) that encapsulates the identity function.

Furthermore the mappend of two duals comes down to function composition of the values in Endo. So:

mempty = Dual (Endo id)
mappend (Dual (Endo f)) (Dual (Endo g)) = Dual (Endo (g . f))

So that means that if we fold over the tree, in case we see an Empty (a leaf), we will return mempty, and in case we see a Node x l r, we will perform the mappend as described above. So a "specialized" foldMap will look like:

-- specialized foldMap
foldMap f Empty = Dual (Endo id)  
foldMap f (Node x l r) =  Dual (Endo (c . b . a))
        where Dual (Endo a) = foldMap f l
              Dual (Endo b) = helper x
              Dual (Endo c) = foldMap f l

So for every Node we make a function composition right-to-left over the the children and item of the node. a and c can be compositions of the tree as well (since these are recursive calls). In case of a Leaf, we do nothing (we return id, but the composition over an id is a no-op).

So that means that if we have a tree:

5
|- 3
|  |- 1
|  `- 6
`- 9
   |- 8
   `- 10

This will result in a function:

(Dual (Endo ( (\x -> f x 10) .
              (\x -> f x 9) .
              (\x -> f x 8) .
              (\x -> f x 5) .
              (\x -> f x 6) .
              (\x -> f x 3) .
              (\x -> f x 1)
            )
      )
)

(omitted the identities, to make it a cleaner). This is the outcome of getDual (foldMap (Dual . Endo . flip f)). But now we need to post process this result. With getDual, we get the content wrapped in the Dual constructor. So now we have:

Endo ( (\x -> f x 10) .
       (\x -> f x 9) .
       (\x -> f x 8) .
       (\x -> f x 5) .
       (\x -> f x 6) .
       (\x -> f x 3) .
       (\x -> f x 1)
     )

and with appEndo, we obtain the function wrapped in Endo, so:

( (\x -> f x 10) .
  (\x -> f x 9) .
  (\x -> f x 8) .
  (\x -> f x 5) .
  (\x -> f x 6) .
  (\x -> f x 3) .
  (\x -> f x 1)
)

and then we apply this to z the "initial" value. So that means that we will process the chain starting with z (the initial element), and apply it like:

f (f (f (f (f (f (f z 1) 3) 6) 5) 8) 9) 10

So we have constructed some sort of Monoid, where mappend is replaced by f, and mempty as a no-op (the identity function).

like image 57
Willem Van Onsem Avatar answered Oct 07 '22 22:10

Willem Van Onsem


It doesn't need to. foldl is translated into foldr which translates into foldMap over Endo which means function composition which means simple nesting of the function you have supplied.

Or something. Meaning, foldl could be translated into foldMap over Dual . Endo which composes left-to-right, etc..

update: yes, the docs says:

Foldable instances are expected to satisfy the following laws:

foldr f z t = appEndo (foldMap (Endo . f) t ) z
foldl f z t = appEndo (getDual (foldMap (Dual . Endo . flip f) t)) z  -- << --
fold = foldMap id

Dual (Endo f) <> Dual (Endo g) = Dual (Endo g <> Endo f) = Dual (Endo (g . f)). So when appEndo strikes, the chain of functions that's been built, i.e.

    ((+10) . (+9) . (+8) . (+5) . ... . (+1))

or equivalent (here, shown for the (+) case), is applied to the user-supplied value -- in your case,

                                                 0

Another thing to notice is that Endo and Dual are newtypes, so all these machinations will be done by compiler, and be gone by run time.

like image 25
Will Ness Avatar answered Oct 07 '22 20:10

Will Ness