Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are Haskell list comprehensions not executed in parallel?

I am doing Project Euler problem 21 for homework and I have this list comprehension:

amicableNumberSums = [ x+y | x<-[1..10000], y <-[1..10000], (amicable x y)]

This takes a very long time to execute (understandable as it tests 10000^2 pairs of numbers) and looking at my cpu usage it shows that only 1 core is being used.

As there are no side effects to the list comprehension there is no danger in multiple pairs of numbers being tested at the same time. Is there a way to make Haskell do this automatically or if not how could my code be modified to do this?

(Edit) Error when running print (amicableNumberSums using parList):

Couldn't match type `a0 -> Eval a0' with `[Int]'
Expected type: Strategy [Int]
  Actual type: Strategy a0 -> Strategy [a0]
In the second argument of `using', namely `parList'
In the first argument of `print', namely
  `(amicableNumberSums `using` parList)'
In the expression: print (amicableNumberSums `using` parList)

(Edit) Performance of the two suggested methods:

Ørjan Johansen's method: 218.79s elapsed parallel (4 cores + 4 hyperthreading)
                         279.37s elapsed sequential (single core)
bheklilr's method: 247.82s elapsed parallel (4 cores + 4 hyperthreading)
                   274.10s elapsed sequential (single core)
Original method: 278.69s elapsed

This is not as large a speed up as I was hoping for but I now have the correct answer to the problem and until I have learnt some more Haskell this is sufficient.

like image 715
bradleyjkemp Avatar asked Sep 08 '14 20:09

bradleyjkemp


People also ask

What is a list comprehension in Haskell?

List comprehension in Haskell is a way to produce the list of new elements from the generator we have passed inside it. Also for the generator values, we can apply the Haskell functions to modify it later. This list comprehension is very y easy to use and handle for developers and beginners as well.


2 Answers

Here's a simple example:

simple = 1 : 1 : [a + b | a <- simple, b <- simple]

How would you parallelize this? How would you generalize this to any list comprehension to decide if it can be parallelized? What about any other list comprehension over a list that is infinite, sparking a new thread for each element would mean sparking infinite threads. What if it's actually much faster to compute the list sequentially as the threading overhead is too large and slows down the computation? What if only the first 10 elements of the list are needed? It wouldn't be very good to compute the whole list greedily when a small fraction is required.

Instead, GHC chose to give the power to the programmer on when and how to parallelize a list computation. Instead of doing things implicitly for you, you can choose how it's done using the Control.Parallel.Strategies module:

print $ amicableNumberSums `using` parList rdeepseq

Or calculating chunks of the list in parallel:

print $ amicableNumberSums `using` parListChunk 64 rdeepseq

Keep in mind that you'll have to use seq and co. to get your data into NF at the right times.

The API exposed by Control.Parallel.Strategies gives you the ability to define different ways of computing a pure data structure in parallel entirely separate from the data structure itself or even other algorithms. This is in stark contrast to most other programming languages which force you to strongly couple your parallelism with your other algorithms, or even how the structure is constructed. I'd highly recommend reading Simon Marlow's Parallel and Concurrent Haskell (it's free online!), which does a much better job than I can at explaining how it works and how to use it.

like image 121
bheklilr Avatar answered Sep 19 '22 15:09

bheklilr


@bheklilr's answer handles the general method of parallelizing strategies, but as I implied in the comment above, the way the original list comprehension is written forces all amicable tests to happen before a parList-based strategy on it can get to its elements and start evaluating them, so I don't think @bheklilr's code will quite work for this specific case.

Here's my suggestion for an improvement. You need to rewrite your list comprehension such that it divides the work into well-defined and independent chunks, e.g. by combining the x values in intermediate lists. For example it can be written equivalently as

concat [[ x+y | y <-[1..10000], (amicable x y)] | x <- [1..10000]]

Now you can put a parallel evaluation strategy in between the comprehension and the final concat:

concat ([[ x+y | y <-[1..10000], (amicable x y)] | x <- [1..10000]]
        `using` parList rdeepseq)

(I use rdeepseq here because the elements of the pre-concatted list are also lists, and we want to evaluate all their elements which are integers.)

like image 44
Ørjan Johansen Avatar answered Sep 19 '22 15:09

Ørjan Johansen