I did some Criterion benchmarks to estimate how much performance I lose by running my code over a monad stack. The results were rather curious, and I have probably stumbled upon some laziness pitfall in my benchmark.
The benchmark tells me that running WriterT String IO
is 20 times(!) slower than running plain IO
, even when not using tell
. Weirdly, if I stack WriterT
with ReaderT
and ContT
it is just 5 times slower. This probably is a bug in my benchmark. What am I doing wrong here?
{-#LANGUAGE BangPatterns#-}
module Main where
import Criterion.Main
import Control.Monad
import Control.Monad.Writer
import Control.Monad.Reader
import Control.Monad.Cont
process :: Monad m => Int -> m Int
process = foldl (>=>) return (replicate 100000 (\(!x) -> return (x+1)))
test n = process n >> return ()
main = defaultMain [
bench "Plain" t0
,bench "Writer" t1
,bench "Reader" t2
,bench "Cont" t3
,bench "RWC" t4
]
t0 = test 1 :: IO ()
t1 = (runWriterT (test 1:: WriterT String IO ()) >> return ()) :: IO ()
t2 = (runReaderT (test 1:: ReaderT String IO ()) "" >> return ()) :: IO ()
t3 = (runContT (test 1:: ContT () IO ()) (return) >> return ()) :: IO ()
t4 = ((runWriterT . flip runReaderT "" . flip runContT return $
(test 1 :: ContT () (ReaderT String (WriterT String IO)) ())) >> return ()) :: IO ()
benchmarking Plain mean: 1.938814 ms, lb 1.846508 ms, ub 2.052165 ms, ci 0.950 std dev: 519.7248 us, lb 428.4684 us, ub 709.3670 us, ci 0.950 benchmarking Writer mean: 39.50431 ms, lb 38.25233 ms, ub 40.74437 ms, ci 0.950 std dev: 6.378220 ms, lb 5.738682 ms, ub 7.155760 ms, ci 0.950 benchmarking Reader mean: 12.52823 ms, lb 12.03947 ms, ub 13.09994 ms, ci 0.950 std dev: 2.706265 ms, lb 2.324519 ms, ub 3.462641 ms, ci 0.950 benchmarking Cont mean: 8.100272 ms, lb 7.634488 ms, ub 8.633348 ms, ci 0.950 std dev: 2.562829 ms, lb 2.281561 ms, ub 2.878463 ms, ci 0.950 benchmarking RWC mean: 9.871992 ms, lb 9.436721 ms, ub 10.37302 ms, ci 0.950 std dev: 2.387364 ms, lb 2.136819 ms, ub 2.721750 ms, ci 0.950
As you've noticed, the lazy writer monad is quite slow. Using the strict version as Daniel Fischer suggests helps a lot, but why does it become so much faster when used in the big stack?
To answer this question, we take a look at the implementation of these transformers. First, the lazy writer monad transformer.
newtype WriterT w m a = WriterT { runWriterT :: m (a, w) }
instance (Monoid w, Monad m) => Monad (WriterT w m) where
return a = WriterT $ return (a, mempty)
m >>= k = WriterT $ do
~(a, w) <- runWriterT m
~(b, w') <- runWriterT (k a)
return (b, w `mappend` w')
As you can see, this does quite a lot. It runs the actions of the underlying monad, does some pattern matching and gathers the written values. Pretty much what you'd expect. The strict version is similar, only without irrefutable (lazy) patterns.
newtype ReaderT r m a = ReaderT { runReaderT :: r -> m a }
instance (Monad m) => Monad (ReaderT r m) where
return = lift . return
m >>= k = ReaderT $ \ r -> do
a <- runReaderT m r
runReaderT (k a) r
The reader transformer is a bit leaner. It distributes the reader environment and calls upon the underlying monad to perform the actions. No surprises here.
Now, let's look at ContT
.
newtype ContT r m a = ContT { runContT :: (a -> m r) -> m r }
instance Monad (ContT r m) where
return a = ContT ($ a)
m >>= k = ContT $ \c -> runContT m (\a -> runContT (k a) c)
Notice anything different? It does not actually use any functions from the underlying monad! In fact, it doesn't even require m
to be a monad. That means that no slow pattern matching or appends are being done at all. Only when you actually try to lift any actions from the underlying monad does ContT
use its bind operator.
instance MonadTrans (ContT r) where
lift m = ContT (m >>=)
So since you're not actually doing any writer-specific stuff, ContT
avoids using the slow bind operator from WriterT
. That's why having ContT
on top of your stack makes it so much faster, and why the run time of the ContT () IO ()
is so similar to that of the deeper stack.
Part of the extreme slowdown of Writer
is that you're using the lazy writer monad, so your bang-pattern doesn't help at all there, cf. the answer to this question for a more detailed explanation (although for State, but it's the same reason here). Changing that to Control.Monad.Writer.Strict
reduced the slowdown here from eight-fold to less-than-four-fold. Still the stack is faster, I haven't yet understood why.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With