
How to limit the number of threads created for an asynchronous Seq.map operation in F#?

The current setup looks something like this:

array
|> Seq.map (fun item -> async { return f item})
|> Async.Parallel
|> Async.RunSynchronously

The problem is that this tends to create too many threads and periodically crashes the application.

How can I limit the number of threads in this case (to, say, Environment.ProcessorCount)?

Alexander asked Sep 17 '10 22:09


2 Answers

Since a 2018 pull request, there has been a built-in option in FSharp.Core: a second overload of Async.Parallel that takes a maxDegreeOfParallelism argument (see the F# documentation).

array
|> Seq.map (fun item -> async { return f item})
|> fun computations -> Async.Parallel(computations, maxDegreeOfParallelism = 20)
|> Async.RunSynchronously
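
For reference, here is a minimal self-contained sketch of the same idea that can be pasted into an F# script. The function `f` and the `input` array are hypothetical stand-ins for the ones in the question, and the cap is set to Environment.ProcessorCount as the question suggests. It assumes an FSharp.Core version that includes the maxDegreeOfParallelism overload.

```fsharp
open System

// Hypothetical workload standing in for `f` from the question.
let f (item: int) = item * item

// Hypothetical input standing in for `array` from the question.
let input = [| 1 .. 20 |]

// At most Environment.ProcessorCount computations are started at once;
// the result array keeps the order of the input sequence.
let results =
    input
    |> Seq.map (fun item -> async { return f item })
    |> fun computations -> Async.Parallel(computations, maxDegreeOfParallelism = Environment.ProcessorCount)
    |> Async.RunSynchronously

printfn "%A" results
```

Note that the limit applies to how many of the async computations run concurrently; it is not a hard limit on the size of the .NET thread pool itself.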
HTC answered Sep 18 '22 17:09


If you want to parallelize a CPU-intensive calculation that takes an array (or any sequence) as input, it may be a better idea to use the PSeq module from the F# PowerPack (which is available only on .NET 4.0, though). It provides parallel versions of many standard Array.xyz functions. For more information, you can also look at the F# translation of the Parallel Programming with .NET samples.

The code to solve your problem would be a bit simpler than using workflows:

array |> PSeq.map f
      |> PSeq.toArray 

Some differences between the two options are:

  • PSeq is built on the Task Parallel Library (TPL) from .NET 4.0, which is optimized for working with a large number of CPU-intensive tasks.
  • Async is implemented in the F# libraries and supports asynchronous (non-blocking) operations, such as I/O, inside the concurrently running computations.

In summary, if you need asynchronous operations (e.g. I/O), then Async is the best option. If you have a large number of CPU-intensive tasks, then PSeq may be a better choice (on .NET 4.0).
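
If the CPU-bound route also needs an explicit cap on parallelism, one possible sketch is to call PLINQ directly. This is a swapped-in technique rather than PSeq itself: it assumes that PLINQ (System.Linq, shipped with .NET 4.0 and later) is an acceptable substitute, so no extra package is needed. The function `f` and the `input` array are hypothetical stand-ins for the ones in the question.

```fsharp
open System
open System.Linq

// Hypothetical CPU-bound function standing in for `f` from the question.
let f (item: int) = item * item

// Hypothetical input standing in for `array` from the question.
let input = [| 1 .. 20 |]

// WithDegreeOfParallelism caps how many items are processed concurrently;
// AsOrdered keeps the results in the same order as the input.
let results =
    input
        .AsParallel()
        .AsOrdered()
        .WithDegreeOfParallelism(Environment.ProcessorCount)
        .Select(fun item -> f item)
        .ToArray()

printfn "%A" results
```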

Tomas Petricek answered Sep 18 '22 17:09