I have been learning F# recently and am particularly interested in how easily it lets you exploit data parallelism. The data |> Array.map |> Async.Parallel |> Async.RunSynchronously idiom seems easy to understand, straightforward to use, and a quick way to get real value.
So why is it that async is not really intended for this? Donald Syme himself says that PLINQ and Futures are probably a better choice, and other answers I've read here agree, also recommending the TPL. (PLINQ doesn't seem much different from the built-in functions above, as long as you're using the F# PowerPack to get the PSeq functions.)
F# and functional languages make a lot of sense for this, and some applications have achieved great success with async parallelism.
So why shouldn't I use async to execute parallel data processes? What am I going to lose by writing parallel async code instead of using PLINQ or TPL?
So why shouldn't I use async to execute parallel data processes?
If you have a tiny number of completely independent non-async tasks and lots of cores, then there is nothing wrong with using async to achieve parallelism. However, if your tasks depend on each other in any way, or you have more tasks than cores, or you push the use of async too far into the code, then you will be leaving a lot of performance on the table and could do much better by choosing a more appropriate foundation for parallel programming.
Note that your example can be written even more elegantly using the TPL from F# though:
Array.Parallel.map f xs
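For comparison, here is a minimal sketch of both forms side by side. The function f and the array xs are hypothetical stand-ins (the question does not say what is being computed), so treat this as an illustration rather than a benchmark:

```fsharp
// Hypothetical CPU-bound work; stands in for whatever the question maps over.
let f (x: int) = x * x

// Hypothetical input data.
let xs = [| 1 .. 10 |]

// The async idiom from the question: wrap each call in an async
// workflow, fork them all, then block until every result is in.
let viaAsync =
    xs
    |> Array.map (fun x -> async { return f x })
    |> Async.Parallel
    |> Async.RunSynchronously

// The Array.Parallel version recommended above.
let viaArrayParallel = Array.Parallel.map f xs

// Both produce results in input order, so the arrays can be compared directly.
printfn "same results: %b" (viaAsync = viaArrayParallel)
```

Note that the async version allocates one workflow per element, whereas Array.Parallel.map takes the plain function directly.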
What am I going to lose by writing parallel async code instead of using PLINQ or TPL?
You lose the ability to write cache-oblivious code. Consequently, you will suffer lots of cache misses, with all cores stalling while they wait for shared memory, which means poor scalability on a multicore machine.
The TPL is built upon the idea that child tasks should execute on the same core as their parent with a high probability and, therefore, will benefit from reusing the same data because it will be hot in the local CPU cache. There is no such assurance with async.
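As a rough sketch of the parent/child pattern this refers to, here is a recursive divide-and-conquer sum written with TPL tasks. The function name parallelSum and the cutoff value are assumptions made for illustration, and where each task actually runs is up to the scheduler, so this shows only the shape of the code and is not a tuned implementation:

```fsharp
open System.Threading.Tasks

// Hypothetical threshold below which we stop spawning tasks and sum sequentially.
let cutoff = 1024

// Sums xs.[lo .. hi - 1] by recursively splitting the range in half.
let rec parallelSum (xs: int[]) (lo: int) (hi: int) : int64 =
    if hi - lo <= cutoff then
        // Small range: sum it sequentially.
        let mutable acc = 0L
        for i in lo .. hi - 1 do
            acc <- acc + int64 xs.[i]
        acc
    else
        let mid = lo + (hi - lo) / 2
        // Spawn the left half as a child task. When this runs on a
        // thread-pool worker, the child is typically queued locally to
        // that worker, so it tends to run near data the parent touched.
        let left = Task.Factory.StartNew<int64>(fun () -> parallelSum xs lo mid)
        // The parent computes the right half itself, then joins the child.
        let right = parallelSum xs mid hi
        left.Result + right

let data = Array.init 10000 id
printfn "%d" (parallelSum data 0 data.Length)
```

Each subdivision works on a contiguous sub-range of the parent's range, which is what lets the child benefit from data that is already hot in the local cache.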