Shared variable in Haskell parMap

Question

I have a computation which does essentially the following:

f :: [a] -> ([b],Bool)

This function can actually be written

f = foldr h ([],False) . map g
    where h (b,bool) (bs,boolSoFar) = (b:bs,bool || boolSoFar)

where g :: a -> (b,Bool) is some function that takes a lot of time. Also f is usually called on small lists, so it seemed like it might be fun to try computing the map in parallel. This can be accomplished with Control.Parallel.Strategies parMap. So now we use

f = foldr h ([],False) . parMap rseq g
    where h (b,bool) (bs,boolSoFar) = (b:bs, bool || boolSoFar)

This all works great. Now, you'll note that there is a sequential optimization that can be performed in the first definition of f. Namely, I can use map-fold fusion to write it as a single fold, so one loop through the list. However, then I lose the benefits of doing parallel.

Now, one might say that in the second definition of f, looping through list again is not so bad, so why not just do it. I guess what I am thinking is that if Haskell had mutable variables, then one could just, in the body of map, update this boolean variable (I guess you would have to lock and unlock it). Are there any suggestions for doing things like this?

Zemyla · Accepted Answer

What this is going to wind up being is actually a traversal under the lazy writer Applicative with the writer state being Bool, since (False, (||)) forms a monoid. You'll want the unamb package as well, so you can get the value the first time any of the parallel calls of g returns True.

import Control.Parallel.Strategies
import Data.Unamb

newtype EvalWB a = EvalWB { runEvalWB :: Eval (a, Bool) }

instance Functor EvalWB where
  fmap f (EvalWB m) = EvalWB $ fmap (\ ~(a, b) -> (f a, b)) m

instance Applicative EvalWB where
  pure a = EvalWB $ pure (a, False)

  EvalWB mf <*> EvalWB ma = EvalWB $ (\ ~(f, bf) ~(a, ba) -> (f a, por bf ba)) <$> mf <*> ma

And then you have

f :: [a] -> ([b], Bool)
f l = runEval $ runEvalWB $ traverse (\a -> EvalWB $ rpar $ g a) l

This passes over the entire list in parallel, accumulating the values and flags lazily. It uses por to short-circuit on the first True return.

Shared variable in Haskell parMap

Tags:

haskell

parallel-processing

Jonathan Gallagher

1 Answers

Zemyla

Recent Activity

Donate For Us

Shared variable in Haskell parMap

Tags:

haskell

parallel-processing

Jonathan Gallagher

1 Answers

Zemyla

Related questions

Recent Activity

Donate For Us