Haskell Nested Vector Parallel Strategy

Similar to this related question, I would like to perform a parallel map on a Vector, but in my case I have a nested Vector, and I can't seem to get the evaluation correct.

Here is what I have so far:

import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import Data.Vector.Strategies 
import Control.DeepSeq
import Control.Parallel.Strategies

main = do
  let res = genVVec 200 `using` parVector 2
  print res

genUVec :: Int -> U.Vector Int
genUVec n = U.map (ack 2) $ U.enumFromN n 75

genVVec :: Int -> V.Vector (U.Vector Int)
genVVec n = V.map genUVec $ V.enumFromN 0 n

ack :: Int -> Int -> Int
ack 0 n = n+1
ack m 0 = ack (m-1) 1
ack m n = ack (m-1) (ack m (n-1))

instance (NFData a, U.Unbox a) => NFData (U.Vector a) where
  rnf = rnf . U.toList

gives:

$ ./vectorPar +RTS -N8 -s >/dev/null
   SPARKS: 200 (17 converted, 183 pruned)
   Total time    1.37s  (  1.30s elapsed)
$ ./vectorPar +RTS -s >/dev/null
   SPARKS: 200 (0 converted, 200 pruned)
   Total time    1.25s  (  1.26s elapsed)

I have also tried modifying the parVector function in vector-strategies directly, but my attempts are clumsy and ineffective.

I realize REPA was designed for nested workloads, but this seems a simple enough problem, and I'd rather not have to rewrite a lot of code.

asked Jul 19 '11 by JKnight

2 Answers

Note: guilty author of vector-strategies here (a very small title, seeing as this was just a hacked-up function I figured others would find useful).

Your observation that parVector is wrong in that it allows the sparks to be GCed prior to use seems to be correct. The advice by SimonM means I must do precisely what I was trying to avoid: construct a new vector, at some cost, in place of the old one. Knowing this is necessary, there is little reason not to change parVector to the much simpler definition:

parVector2 :: NFData a => Int -> Strategy (V.Vector a)
parVector2 n = liftM V.fromList . parListChunk n rdeepseq . V.toList
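
For reference, here is a minimal sketch of how parVector2 could be dropped into the program from the question. It assumes liftM from Control.Monad and parListChunk/rdeepseq/using from Control.Parallel.Strategies are in scope, and that the vector library in use already provides an NFData instance for unboxed vectors (recent releases do; with the 2011-era library you would keep the orphan instance from the question):

import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import Control.Monad (liftM)
import Control.DeepSeq
import Control.Parallel.Strategies

-- Round-trip through a list so parListChunk can spark chunks of n
-- elements with rdeepseq, then rebuild the boxed vector from the result.
parVector2 :: NFData a => Int -> Strategy (V.Vector a)
parVector2 n = liftM V.fromList . parListChunk n rdeepseq . V.toList

main :: IO ()
main = print (genVVec 200 `using` parVector2 2)

genUVec :: Int -> U.Vector Int
genUVec n = U.map (ack 2) $ U.enumFromN n 75

genVVec :: Int -> V.Vector (U.Vector Int)
genVVec n = V.map genUVec $ V.enumFromN 0 n

ack :: Int -> Int -> Int
ack 0 n = n + 1
ack m 0 = ack (m-1) 1
ack m n = ack (m-1) (ack m (n-1))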

Notice that the fix given by John L (below) only works because it "beats" the collector by forcing the computations before collection would occur.

I'll be changing the vector-strategies library so this is unnecessary, making your original code work fine. Unfortunately, this will incur the above-mentioned cost of constructing a new Vector (usually minimal).

answered Nov 08 '22 by Thomas M. DuBuisson

The problem appears to be that parVector doesn't force evaluation of the elements of the vector. Each element remains a thunk and nothing is sparked until the vector is consumed (by being printed), which is too late for the sparks to do work. You can force evaluation of each element by composing the parVector strategy with rdeepseq.

import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import Data.Vector.Strategies
import Control.DeepSeq
import Control.Parallel.Strategies

main = do
  let res = genVVec 200 `using` (rdeepseq `dot` parVector 20)
  print res

genUVec :: Int -> U.Vector Int
genUVec n = U.map (ack 2) $ U.enumFromN n 75

genVVec :: Int -> V.Vector (U.Vector Int)
genVVec n = V.map genUVec $ V.enumFromN 0 n

ack :: Int -> Int -> Int
ack 0 n = n+1
ack m 0 = ack (m-1) 1
ack m n = ack (m-1) (ack m (n-1))

instance (NFData a, U.Unbox a) => NFData (U.Vector a) where
  rnf vec = seq vec ()

instance (NFData a) => NFData (V.Vector a) where
  rnf = rnf . V.toList

I also changed your NFData (U.Vector a) instance. Since a U.Vector is unboxed, evaluating it to WHNF is sufficient, and forcing each element via the list conversion is wasteful. In fact, the default definition of rnf works fine here if you prefer.
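
As a sketch, relying on that default could look like the following; this assumes a deepseq of that era, whose class default was rnf a = a `seq` () (later deepseq releases dropped that default, and later vector releases ship this instance themselves):

-- Uses the old default method, i.e. evaluation to WHNF only;
-- equivalent to the explicit rnf vec = seq vec () above.
instance (NFData a, U.Unbox a) => NFData (U.Vector a)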

With these two changes, I get the following:

SPARKS: 200 (200 converted, 0 pruned)

and the runtime has been reduced by nearly 50%. I also adjusted the vector chunk size to 20, but the result is very similar to a chunk size of 2.

answered Nov 08 '22 by John L