I was looking at the code of parBuffer in parallel-3.2.0.4 but I am missing something on how it works. I don't see how can it create new sparks aside from the initial ones. As far as I can see it's using start in parBufferWHNF to force the first n to be sparked with par, and then going through ret it's using par again on the same entries (shouldn't this just discard y and not risk to get the spark GC'd?) while returning the corresponding result? and then it's returning directly xs, without any additional spark creation as rdeepseq is just calling pseq.
But clearly testing code like this
withStrategy (parBuffer 10 rdeepseq) $ take 100 [ expensive stuff ]
I can see all the 100 sparks in the ghc RTS informations, but where are the other 90 created?
Here is the code I was looking at:
parBufferWHNF :: Int -> Strategy [a]
parBufferWHNF n0 xs0 = return (ret xs0 (start n0 xs0))
where -- ret :: [a] -> [a] -> [a]
ret (x:xs) (y:ys) = y `par` (x : ret xs ys)
ret xs _ = xs
-- start :: Int -> [a] -> [a]
start 0 ys = ys
start !_n [] = []
start !n (y:ys) = y `par` start (n-1) ys
-- | Like 'evalBuffer' but evaluates the list elements in parallel when
-- pushing them into the buffer.
parBuffer :: Int -> Strategy a -> Strategy [a]
parBuffer n strat = parBufferWHNF n . map (withStrategy strat)
parBuffer
is conceptually similar to a circular buffer with a constant window size rolling over the input and producing the output and is useful when implementing pipeline parallelism or working with lazy streams.
Its implementation internally depends on how the result is evaluated -- it makes
use of lazyness and graph sharing (which explains why the sparks are not discarded) to produce output as input is consumed ensuring that the number of threads is limited to N
and hence constant space is used (as opposed to parList
which is linear in the length of argument list).
The start
function is used to create the initial N
sparks and pass the rest of the input to ret
unsparked. The ret
function takes two lists (xs0
and xs0
but without the initial N
elements, as returned by start
) and sparks an element
from the second list every time a thread completes (the x
in the result; this actually happens once the user demands the results) until there are no elements left.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With