Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using parallel strategies with monads

I often see the usage and explanation of Haskell's parallel strategies connected to pure computations (for example fib). However, I do not often see it used with monadic constructions: is there a reasonable interpretation of the effect of par and related functions when applied to ST s or IO ? Would any speedup be gained from such a usage?

like image 349
ScottWest Avatar asked Jan 04 '11 10:01

ScottWest


2 Answers

Parallelism in the IO monad is more correctly called "Concurrency", and is supported by forkIO and friends in the Control.Concurrent module.

The difficulty with parallelising the ST monad is that ST is necessarily single-threaded - that's its purpose. There is a lazy variant of the ST monad, Control.Monad.ST.Lazy, which in principle could support parallel evaluation, but I'm not aware of anyone having tried to do this.

There's a new monad for parallel evaluation called Eval, which can be found in recent versions of the parallel package. I recommend using the Eval monad with rpar and rseq instead of par and pseq these days, because it leads to more robust and readable code. For example, the usual fib example can be written

fib n = if n < 2 then 1 else 
        runEval $ do
           x <- rpar (fib (n-1))
           y <- rseq (fib (n-2))
           return (x+y)
like image 196
Simon Marlow Avatar answered Oct 30 '22 17:10

Simon Marlow


There are some situations where this makes sense, but in general you shouldn't do it. Examine the following:

doPar =
  let a = unsafePerformIO $ someIOCalc 1
      b = unsafePerformIO $ someIOCalc 2
  in a `par` b `pseq` a+b

in doPar, a calculation for a is sparked, then the main thread evaluates b. But, it's possible that after the main thread finishes the calculation of b it will begin to evaluate a as well. Now you have two threads evaluating a, meaning that some of the IO actions will be performed twice (or possibly more). But if one thread finishes evaluating a, the other will just drop what it's done so far. In order for this to be safe, you need a few things to be true:

  1. It's safe for the IO actions to be performed multiple times.
  2. It's safe for only some of the IO actions to be performed (e.g. there's no cleanup)
  3. The IO actions are free of any race conditions. If one thread mutates some data when evaluating a, will the other thread also working on a behave sensibly? Probably not.
  4. Any foreign calls are re-entrant (you need this for concurrency in general of course)

If your someIOCalc looks like this

someIOCalc n = do
  prelaunchMissiles
  threadDelay n
  launchMissiles

it's absolutely not safe to use this with par and unsafePerformIO.

Now, is it ever worth it? Maybe. Sparks are cheap, even cheaper than threads, so in theory it should be a performance gain. In practice, perhaps not so much. Roman Leschinsky has a nice blog post about this.

Personally, I've found it much simpler to reason about forkIO.

like image 43
John L Avatar answered Oct 30 '22 18:10

John L