This question ideally solicits the opinion of advanced users.
I'm optimizing some number-crunching code and have a few processor cores free. To make use of them effectively, I am looking at fine-grained concurrency using tasks (Task.Factory.StartNew) for extremely short-running operations (100 to 500 milliseconds). There is no UI involved at all.
One crucial element here is the ThreadPool, which gives short-running tasks a performance benefit by avoiding the overhead of thread creation. In some scenarios, however, it is more natural to think in loops than in tasks (e.g. Parallel.For).
I understand that Tasks use the ThreadPool but am not sure about Parallel.For. The question is: does Parallel.For also run its work on ThreadPool threads?
Please note that this question is specific to .NET 4. Upgrading to 4.5 is not an option at the moment.
UPDATE:
Based on Daniel Kinsman's comment, consider the following:
System.Threading.Tasks.Parallel.For(0, 100, i => { System.Console.WriteLine("Is ThreadPool @ low priority: " + System.Threading.Thread.CurrentThread.IsThreadPoolThread); });
System.Diagnostics.Process.GetCurrentProcess().PriorityBoostEnabled = true;
System.Diagnostics.Process.GetCurrentProcess().PriorityClass = System.Diagnostics.ProcessPriorityClass.RealTime;
System.Threading.Thread.CurrentThread.Priority = System.Threading.ThreadPriority.Highest;
System.Threading.Tasks.Parallel.For(0, 100, i => { System.Console.WriteLine("Is ThreadPool @ high priority: " + System.Threading.Thread.CurrentThread.IsThreadPoolThread); });
If you have a multi-core machine, IsThreadPoolThread will return non-deterministic results depending on how many applications you have running.
By default, Parallel.For uses the ThreadPool to run its tasks, so it does not normally incur thread-creation overhead.
As of the current release of the .NET 4.0 CLR, Parallel.For internally uses the ThreadPoolTaskScheduler class (an internal class) to run its tasks. Of course, this is an internal detail that is subject to change, but in all probability Parallel.For will continue to use the ThreadPool. If you don't want to take this 'risk', you can always write your own TaskScheduler and supply it to Parallel.For through the TaskScheduler property of the ParallelOptions class.
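As a rough illustration of the custom-scheduler option, here is a minimal sketch that should compile against .NET 4. DedicatedThreadScheduler and SchedulerDemo are hypothetical names invented for this example, not framework types. The sketch assumes you want loop bodies to run only on threads you own, so it declines to inline work on the calling thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler: executes tasks on its own dedicated (non-pool) threads.
public sealed class DedicatedThreadScheduler : TaskScheduler, IDisposable
{
    [ThreadStatic] private static bool _isWorker;   // true only on this class's worker threads

    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();
    private readonly List<Thread> _threads;

    public DedicatedThreadScheduler(int threadCount)
    {
        _threads = Enumerable.Range(0, threadCount).Select(n =>
        {
            var thread = new Thread(() =>
            {
                _isWorker = true;
                // Blocks while the queue is empty; the loop ends after CompleteAdding().
                foreach (Task task in _queue.GetConsumingEnumerable())
                    TryExecuteTask(task);
            });
            thread.IsBackground = true;
            thread.Name = "Dedicated-" + n;
            thread.Start();
            return thread;
        }).ToList();
    }

    public override int MaximumConcurrencyLevel { get { return _threads.Count; } }

    protected override void QueueTask(Task task) { _queue.Add(task); }

    // Inline only on our own workers, so the caller's thread never runs loop bodies.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return _isWorker && TryExecuteTask(task);
    }

    protected override IEnumerable<Task> GetScheduledTasks() { return _queue.ToArray(); }

    public void Dispose()
    {
        _queue.CompleteAdding();                      // let the workers drain the queue and exit
        foreach (Thread thread in _threads) thread.Join();
        _queue.Dispose();
    }
}

public static class SchedulerDemo
{
    // Returns how many iterations ran on a ThreadPool thread (this should be 0 here).
    public static int CountPoolIterations()
    {
        int onPool = 0;
        using (var scheduler = new DedicatedThreadScheduler(Environment.ProcessorCount))
        {
            var options = new ParallelOptions { TaskScheduler = scheduler };
            Parallel.For(0, 100, options, i =>
            {
                if (Thread.CurrentThread.IsThreadPoolThread) Interlocked.Increment(ref onPool);
            });
        }
        return onPool;
    }

    public static void Main()
    {
        Console.WriteLine("Iterations on ThreadPool threads: " + CountPoolIterations());
    }
}
```

Dedicated threads give you control over priority and naming, but you lose the ThreadPool's tuning. For plain CPU-bound loops the default scheduler is usually the simpler choice.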
Regarding your observation: "If you have a multi-core machine, IsThreadPoolThread will return non-deterministic results depending on how many applications you have running."
Explanation:
The ThreadPoolTaskScheduler class is an optimized task scheduler that avoids handing a task to another thread when that is not necessary: if a task can be executed straight away, it runs on the current thread, which might not be a ThreadPool thread. The 'false' output you see comes from this case. If the task cannot be run immediately, it is always queued to the ThreadPool for later execution.
Note that in neither case is there thread-creation 'overhead' (not entirely true, as the ThreadPool may itself decide, based on its own heuristics, to add new threads to the pool).
You can use the following code to confirm this:
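// Requires: using System; using System.Threading; using System.Threading.Tasks;
Thread.CurrentThread.Name = "MyThread"; // label the calling thread so it can be recognised in the output
Parallel.For(0, 100, i => Console.WriteLine(i + ". Is ThreadPool: " + Thread.CurrentThread.IsThreadPoolThread + " Name: " + Thread.CurrentThread.Name));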
You will notice that for all the 'false' cases the thread name is 'MyThread', i.e. those iterations ran on the calling thread.
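If reading 100 console lines is tedious, the same check can be summarised as a tally per thread. The following is only a sketch: ThreadTally is a hypothetical helper name, and the SpinWait call is an assumed stand-in for real CPU-bound work. The split between threads will vary from run to run and from machine to machine.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadTally
{
    // Runs a Parallel.For and returns iteration counts keyed by a thread description.
    public static ConcurrentDictionary<string, int> Run(int iterations)
    {
        var perThread = new ConcurrentDictionary<string, int>();
        Parallel.For(0, iterations, i =>
        {
            Thread current = Thread.CurrentThread;
            string key = (current.IsThreadPoolThread ? "pool" : "non-pool")
                         + " thread #" + current.ManagedThreadId;
            perThread.AddOrUpdate(key, 1, (k, count) => count + 1);
            Thread.SpinWait(100000); // simulate a little CPU-bound work
        });
        return perThread;
    }

    public static void Main()
    {
        foreach (var pair in Run(100))
            Console.WriteLine(pair.Key + ": " + pair.Value + " iterations");
    }
}
```

Any "non-pool" entry in the output should correspond to the thread that called Parallel.For, which matches the explanation above.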
Regarding maximizing parallelism: CPU-bound work can be parallelised only to the extent of the number of CPU cores available. The TPL takes this into account and is well suited to, and optimized for, parallel computation of CPU-bound tasks. By using the TPL you are on the right path and should continue on it :)
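If you ever want to state the core-count limit explicitly rather than rely on the default (MaxDegreeOfParallelism defaults to -1, meaning no explicit limit), ParallelOptions lets you cap it. A minimal sketch, where CpuBoundDemo and SumOfSquares are hypothetical names and the summation stands in for your real number crunching:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CpuBoundDemo
{
    // Sums the squares of 0..n-1 in parallel, capped at one worker per logical core.
    public static long SumOfSquares(int n)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
        long total = 0;
        Parallel.For(0, n, options,
            () => 0L,                                          // initial subtotal for each worker
            (i, loopState, subtotal) => subtotal + (long)i * i, // accumulate without locking
            subtotal => Interlocked.Add(ref total, subtotal));  // merge once per worker
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(1000));
    }
}
```

The thread-local overload keeps shared-state contention out of the inner loop, which matters when each iteration is very short.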