I am doing Project Euler problem 21 for homework and I have this list comprehension:
amicableNumberSums = [ x+y | x<-[1..10000], y <-[1..10000], (amicable x y)]
This takes a very long time to execute (understandably, as it tests 10000^2 pairs of numbers), and looking at my CPU usage it shows that only one core is being used.
As the list comprehension has no side effects, there is no danger in testing multiple pairs of numbers at the same time. Is there a way to make Haskell do this automatically, and if not, how could my code be modified to do it?
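(For context: the question doesn't show the amicable predicate. One plausible sketch — the names divisorSum and amicable-as-written-here are guesses, not the asker's actual code — would be:)

```haskell
-- Hypothetical sketch of the amicable test; the asker's actual
-- definition isn't shown. divisorSum is a naive proper-divisor sum.
divisorSum :: Int -> Int
divisorSum n = sum [d | d <- [1 .. n - 1], n `mod` d == 0]

amicable :: Int -> Int -> Bool
amicable x y = x /= y && divisorSum x == y && divisorSum y == x
```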
(Edit) Error when running print (amicableNumberSums `using` parList):
Couldn't match type `a0 -> Eval a0' with `[Int]'
Expected type: Strategy [Int]
Actual type: Strategy a0 -> Strategy [a0]
In the second argument of `using', namely `parList'
In the first argument of `print', namely
`(amicableNumberSums `using` parList)'
In the expression: print (amicableNumberSums `using` parList)
(Edit) Performance of the two suggested methods:
Ørjan Johansen's method: 218.79s elapsed parallel (4 cores + 4 hyperthreading), 279.37s elapsed sequential (single core)
bheklilr's method: 247.82s elapsed parallel (4 cores + 4 hyperthreading), 274.10s elapsed sequential (single core)
Original method: 278.69s elapsed
This is not as large a speed-up as I was hoping for, but I now have the correct answer to the problem, and until I have learnt some more Haskell this is sufficient.
Here's a simple example:
simple = 1 : 1 : [a + b | a <- simple, b <- simple]
How would you parallelize this? How would you generalize this to any list comprehension and decide whether it can be parallelized? What about a list comprehension over an infinite list, where sparking a new thread for each element would mean sparking infinitely many threads? What if it's actually much faster to compute the list sequentially because the threading overhead is too large and slows down the computation? What if only the first 10 elements of the list are needed? It wouldn't be very good to compute the whole list greedily when only a small fraction is required.
Instead, GHC gives the programmer the power to decide when and how to parallelize a list computation. Rather than doing things implicitly for you, you choose how it's done using the Control.Parallel.Strategies module:
print $ amicableNumberSums `using` parList rdeepseq
Or calculating chunks of the list in parallel:
print $ amicableNumberSums `using` parListChunk 64 rdeepseq
Keep in mind that you'll have to use seq and co. to get your data into normal form at the right times.
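(As a self-contained illustration — the expensive function and the chunk size of 100 are placeholders, not from the question — parallelizing a plain map with parListChunk looks like this:)

```haskell
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)

-- Placeholder for some costly pure computation.
expensive :: Int -> Int
expensive n = sum [1 .. n]

-- Evaluate the mapped list in parallel, 100 elements per spark.
results :: [Int]
results = map expensive [1 .. 1000] `using` parListChunk 100 rdeepseq

main :: IO ()
main = print (sum results)
```

Compile with ghc -O2 -threaded and run with +RTS -N to actually use multiple cores; without -threaded, the strategy degenerates to sequential evaluation.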
The API exposed by Control.Parallel.Strategies gives you the ability to define different ways of computing a pure data structure in parallel, entirely separate from the data structure itself or even the other algorithms involved. This is in stark contrast to most other programming languages, which force you to tightly couple your parallelism with your other algorithms, or even with how the structure is constructed. I'd highly recommend reading Simon Marlow's Parallel and Concurrent Programming in Haskell (it's free online!), which does a much better job than I can of explaining how it works and how to use it.
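(To illustrate that separation, here's a small sketch of a custom strategy — the name parPairBoth is made up — built from the library's rparWith combinator. It describes how to evaluate a pair in parallel without touching the code that produces the pair; the library in fact ships parTuple2, which does essentially this:)

```haskell
import Control.Parallel.Strategies

-- A made-up strategy for pairs: spark both components in parallel,
-- evaluating each with the given sub-strategy.
parPairBoth :: Strategy a -> Strategy b -> Strategy (a, b)
parPairBoth sa sb (a, b) = do
  a' <- rparWith sa a
  b' <- rparWith sb b
  return (a', b')

main :: IO ()
main = print ((sum [1 .. 100 :: Int], product [1 .. 5 :: Int])
                `using` parPairBoth rdeepseq rdeepseq)
```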
@bheklilr's answer covers the general method of parallelizing with strategies, but as I implied in the comment above, the way the original list comprehension is written forces all the amicable tests to happen before a parList-based strategy can get to its elements and start evaluating them, so I don't think @bheklilr's code will quite work for this specific case.
Here's my suggestion for an improvement. You need to rewrite your list comprehension so that it divides the work into well-defined, independent chunks, e.g. by collecting the results for each x value into intermediate lists. For example, it can be written equivalently as
concat [[ x+y | y <-[1..10000], (amicable x y)] | x <- [1..10000]]
Now you can put a parallel evaluation strategy between the comprehension and the final concat:
concat ([[ x+y | y <-[1..10000], (amicable x y)] | x <- [1..10000]]
`using` parList rdeepseq)
(I use rdeepseq here because the elements of the pre-concatenated list are also lists, and we want to evaluate all of their integer elements.)
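(Putting it together as a runnable sketch: the divisorSum and amicable definitions below are guesses, since the asker's versions aren't shown, and the limit is made a parameter. Note that this naive divisorSum is far too slow for limit 10000, so the demo main uses 300, below which (220, 284) is the only amicable pair:)

```haskell
import Control.Parallel.Strategies (parList, rdeepseq, using)

-- Guessed helpers; the asker's actual amicable test isn't shown.
divisorSum :: Int -> Int
divisorSum n = sum [d | d <- [1 .. n - 1], n `mod` d == 0]

amicable :: Int -> Int -> Bool
amicable x y = x /= y && divisorSum x == y && divisorSum y == x

-- One inner list per x, so each spark gets a well-defined chunk of work.
amicableNumberSums :: Int -> [Int]
amicableNumberSums limit =
  concat ([[ x + y | y <- [1 .. limit], amicable x y ] | x <- [1 .. limit]]
            `using` parList rdeepseq)

main :: IO ()
main = print (sum (amicableNumberSums 300))
```

Each pair is found twice here (once as (x, y) and once as (y, x)), matching the original comprehension's behavior.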