
Why do we need 'seq' or 'pseq' with 'par' in Haskell?

I'm trying to understand why we need all parts of the standard sample code:

a `par` b `pseq` a+b

Why won't the following be sufficient?

a `par` b `par` a+b

The above expression seems very descriptive: try to evaluate both a and b in parallel, and return the result a+b. Is the only reason efficiency, in that the second version creates two sparks instead of one?

How about the following, more succinct version?

a `par` a+b

Why would we need to make sure b is evaluated before a+b as in the original, standard code?

kirakun asked Jan 02 '11 02:01


3 Answers

Ok. I think the following paper answers my question: http://community.haskell.org/~simonmar/papers/threadscope.pdf

In summary, the problem with

a `par` b `par` a+b 

and

a `par` a+b

is the lack of ordering of evaluation. In both versions, the main thread starts working on a (or sometimes b) immediately, so the sparks "fizzle": there is no longer any need to start a thread to evaluate what the main thread has already begun evaluating.

The original version

a `par` b `pseq` a+b

ensures that the main thread evaluates b before a+b (otherwise it would have started evaluating a instead), which gives the spark for a a chance to be picked up and evaluated in parallel.
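To make the comparison concrete, here is a minimal sketch of the three variants with the parentheses written out (both `par` and `pseq` are `infixr 0`, so this is how the unparenthesised forms parse). The names `slow`, `standard`, `twoSparks` and `oneSpark` are made up for illustration, and `par`/`pseq` are imported from `GHC.Conc` in base so the block needs no extra package; `Control.Parallel` from the `parallel` package exports the same combinators.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical workload: slow enough that sparking it is worthwhile.
slow :: Int -> Integer
slow n = sum [1 .. toInteger n]

-- Standard idiom: spark a, evaluate b on the current thread, then add.
standard :: Integer -> Integer -> Integer
standard a b = a `par` (b `pseq` (a + b))

-- Two sparks, but nothing stops the current thread from starting on
-- a+b right away, so at least one spark is likely to fizzle.
twoSparks :: Integer -> Integer -> Integer
twoSparks a b = a `par` (b `par` (a + b))

-- One spark, but the current thread may demand a first while
-- evaluating a+b, which leaves the spark with nothing to do.
oneSpark :: Integer -> Integer -> Integer
oneSpark a b = a `par` (a + b)

main :: IO ()
main = do
  -- All three print the same number; they differ only in how the
  -- work may be distributed across cores.
  print (standard  (slow 3000000) (slow 4000000))
  print (twoSparks (slow 3000000) (slow 4000000))
  print (oneSpark  (slow 3000000) (slow 4000000))
```

All three functions return the same value, so the difference only shows up in the runtime statistics. Compiling with `-threaded -rtsopts` and running with `+RTS -N -s` prints a SPARKS line that reports how many sparks were converted and how many fizzled.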

kirakun answered Nov 09 '22 05:11


a `par` b `par` a+b 

will evaluate a and b in parallel and return a+b, yes.

However, the pseq in the standard version ensures that b is evaluated before a+b is, while a has already been sparked for parallel evaluation.

See this link for more details on that topic.

Alp Mestanogullari answered Nov 09 '22 04:11


a `par` b `par` a+b creates sparks for both a and b, but a+b is reached immediately so one of the sparks will fizzle (i.e., it is evaluated in the main thread). The problem with this is efficiency, as we created an unnecessary spark. If you're using this to implement parallel divide & conquer then the overhead will limit your speedup.

a `par` a+b seems better because it only creates a single spark. However, attempting to evaluate a before b will fizzle the spark for a, and as b does not have a spark this will result in sequential evaluation of a+b. Switching the order to b+a might seem to solve this problem, but the order of the operands in the source does not enforce an evaluation order, and Haskell could still evaluate a first.

So, we do a `par` b `pseq` a+b to force evaluation of b in the main thread before we attempt to evaluate a+b. This gives the spark for a a chance to materialise before we try evaluating a+b, and we haven't created any unnecessary sparks.
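For the divide & conquer case mentioned above, a common way to keep spark overhead down is to stop sparking below some problem size. The following is only a sketch under that assumption: `fib`, `parFib` and the `cutoff` parameter are illustrative names, and the cutoff value used in `main` is arbitrary rather than tuned.

```haskell
import GHC.Conc (par, pseq)

-- Plain sequential Fibonacci, used once the problem is small.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Parallel divide & conquer: spark one branch, evaluate the other on
-- the current thread, then combine. Below the (hypothetical) cutoff
-- no sparks are created at all.
parFib :: Int -> Int -> Integer
parFib cutoff n
  | n < max 2 cutoff = fib n
  | otherwise        = a `par` (b `pseq` (a + b))
  where
    a = parFib cutoff (n - 1)
    b = parFib cutoff (n - 2)

main :: IO ()
main = print (parFib 20 27)
```

Each recursive step creates exactly one spark, and the `pseq` keeps the current thread busy with b while that spark waits to be picked up. As with any use of `par`, the program needs to be built with `-threaded` and run with `+RTS -N` for the sparks to run on other cores.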

Matt answered Nov 09 '22 04:11