I've just stated working in haskell semi-explicit parallelism with GHC 6.12. I've write the following haskell code to compute in parallel the map of the fibonnaci function upon 4 elements on a list, and in the same time the map of the function sumEuler upon two elements.
import Control.Parallel
import Control.Parallel.Strategies
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
mkList :: Int -> [Int]
mkList n = [1..n-1]
relprime :: Int -> Int -> Bool
relprime x y = gcd x y == 1
euler :: Int -> Int
euler n = length (filter (relprime n) (mkList n))
sumEuler :: Int -> Int
sumEuler = sum . (map euler) . mkList
-- parallel initiation of list walk
mapFib :: [Int]
mapFib = map fib [37, 38, 39, 40]
mapEuler :: [Int]
mapEuler = map sumEuler [7600, 7600]
parMapFibEuler :: Int
parMapFibEuler = (forceList mapFib) `par` (forceList mapEuler `pseq` (sum mapFib + sum mapEuler))
-- how to evaluate in whnf form by forcing
forceList :: [a] -> ()
forceList [] = ()
forceList (x:xs) = x `pseq` (forceList xs)
main = do putStrLn (" sum : " ++ show parMapFibEuler)
to improve my program in parallel I rewrote it with par and pseq and a forcing function to force whnf evaluation. My problem is that by looking in the threadscope it appear that i didn't gain any parallelism. Things are worse because i didn't gain any speedup.
That why I have theses two questions
Question 1 How could I modify my code to exploit any parallelism ?
Question 2 How could I write my program in order to use Strategies (parMap, parList, rdeepseq and so on ...) ?
First improvement with Strategies
according to his contribution
parMapFibEuler = (mapFib, mapEuler) `using` s `seq` (sum mapFib + sum mapEuler) where
s = parTuple2 (seqList rseq) (seqList rseq)
the parallelism appears in the threadscope but not enough to have a significant speedup
The reason you aren't seeing any parallelism here is because your spark has been garbage collected. Run the program with +RTS -s
and note this line:
SPARKS: 1 (0 converted, 1 pruned)
the spark has been "pruned", which means removed by the garbage collector. In GHC 7 we made a change to the semantics of sparks, such that a spark is now garbage collected (GC'd) if it is not referred to by the rest of the program; the details are in the "Seq no more" paper.
Why is the spark GC'd in your case? Look at the code:
parMapFibEuler :: Int
parMapFibEuler = (forceList mapFib) `par` (forceList mapEuler `pseq` (sum mapFib + sum mapEuler))
the spark here is the expression forkList mapFib
. Note that the value of this expression is not required by the rest of the program; it only appears as an argument to par
. GHC knows that it isn't required, so it gets garbage collected.
The whole point of the recent changes to the parallel
package were to let you easily avoid this bear trap. A good Rule of Thumb is to use Control.Parallel.Strategies
rather than par
and pseq
directly. My preferred way to write this would be
parMapFibEuler :: Int
parMapFibEuler = runEval $ do
a <- rpar $ sum mapFib
b <- rseq $ sum mapEuler
return (a+b)
but sadly this doesn't work with GHC 7.0.2, because the spark sum mapFib
is floated out as a static expression (a CAF), and the runtime doesn't think sparks that point to static expressions are worth keeping (I'll fix this). This wouldn't happen in a real program, of course! So let's make the program a bit more realistic and defeat the CAF optimisation:
parMapFibEuler :: Int -> Int
parMapFibEuler n = runEval $ do
a <- rpar $ sum (take n mapFib)
b <- rseq $ sum (take n mapEuler)
return (a+b)
main = do [n] <- fmap (fmap read) getArgs
putStrLn (" sum : " ++ show (parMapFibEuler n))
Now I get good parallelism with GHC 7.0.2. However, note that @John's comments also apply: generally you want to look for more fine-grained parallelism so as to let GHC use all your processors.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With