Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the point of having a lazy/strict version of Writer?

Why are there two different Writer-type monads in Haskell? Intuitively to me, reading "strict writer monad" means that the <> is strict, so that there's no thunk buildup in the log. However, looking at the source code, it turns out that that isn't the case:

-- Lazy Writer
instance (Monoid w, Monad m) => Monad (WriterT w m) where
-- ...
m >>= k  = WriterT $ do
    ~(a, w)  <- runWriterT m
    ~(b, w') <- runWriterT (k a)
    return (b, w <> w')

In the strict version the patterns aren't irrefutable, i.e. the ~ are missing. So what happens above is that m and k a are not evaluated, but stored as thunks. In the strict version, they are evaluated to check whether they match the tuple patterns, the result is fed to <>. In both cases, the >>= isn't evaluated until something actually demands the resulting value. So the way I understand it is that both the lazy and strict versions do the same thing, except that they have the thunk in a different place inside the definition of >>=: lazy produces runWriterT thunks, strict produces <> thunks.

This leaves me with two questions:

  1. Is the above right, or do I misunderstand evaluation here?
  2. Can I accomplish strict <> without writing my own wrapper and instance?
like image 543
David Avatar asked Feb 01 '13 11:02

David


1 Answers

You first observation is correct, but this distinction between which thunks get created is important.

Lazy and Strict aren't about the strictness in the log type, but instead about the strictness in the pair.

These arise because a pair in Haskell has two possible ways to update it.

bimap f g (a,b) = (f a, g b)

or

bimap f g ~(a,b) = (f a, g b)

The latter is the same as

bimap f g p = (f (fst p), g (snd p))

The difference between these two is that when you pass the args to bimap in the first case, the pair is forced immediately.

In the latter case the pair is not immediately forced, but I instead hand you a (,) back filled with two non-strict computations.

This means that

fmap f _|_ = _|_ 

in the first case but

fmap f _|_ = (_|_, _|_)

in the second lazier pair case!

Both are correct under different interpretations of the concept of a pair. One is forced on you by pretending a pair is a pair in the categorical sense, that it doesn't have any interesting _|_'s in its own right. On the other hand, the interpretation of the domain as being as non-strict. as possible so you can have as many programs terminate as possible ushes you to the Lazy version.

(,) e is a perfectly admissable Writer, so this characterizes the problem.

The reason the distinction is made is that it matters for the termination of many exotic programs that take a fixed point through the monad. You can answer questions about certain circular programs involving state or writer, so long as they are Lazy.

Note, in neither case is this strict in the 'log' argument. Once you incur strictness in that you lose proper associativity and cease technically to be a Monad. =/

Because this isn't a monad, we don't supply it in the mtl!

With that, we can address your second question:

There are some workarounds though. You can construct a fake Writer on top of State. Basically pretend you aren't handed a state argument. and just mappend into the state as you would tell. Now you can do this strictly, because it isn't happening behind your back as part of every bind. The State is just passing through the state unmodified between actions.

shout :: Monoid s => s -> Strict.StateT s m ()
shout s' = do
   s <- get
   put $! s <> s'

This does, however mean that you force your entire State monad to get the output, and cannot produce parts of the Monoid lazily but you get something that is operationally closer to what an strict programmer would expect. Interestingly this works even with just Semigroup, because the only use of mempty is effectively at the start when you runState.

like image 96
Edward Kmett Avatar answered Nov 11 '22 09:11

Edward Kmett