
Profiling/optimising heavily multithreaded application

I'm writing a performance-critical .NET application which makes heavy use of multithreading.

Using the Visual Studio performance profiler, the top functions with Exclusive samples are:

WaitHandle.WaitAny() - 14.23%

@JIT_MonReliableEnter@8 - 7.76%

Monitor.Enter - 5.09%

Basically, my top three functions are threading primitives and, I believe, out of my control to some extent. My work/processing routines are pretty small in comparison, and I'm trying to increase performance. I believe the algorithms involved are pretty sound, although I am reviewing them fairly frequently.

My questions are:

  • If 14.23% of CPU samples fall in WaitHandle.WaitAny(), is the CPU effectively 'idle' for most of those samples, i.e. just waiting on other threads? Or is the idle part of the thread waits not shown in the profile trace at all, so that the 27.08% shown for these three functions is the sum of all overhead within those sync methods? (I would guess that this is mostly idle, but would appreciate some decent reference material behind answers to this one, please.)
  • I have reviewed my locking schemes; however, do these results indicate some particular bottleneck or technique I should look into for further optimisation?
  • Is WaitAny quite poor in particular? I use it heavily to check whether particular queue objects are readable/writable while also checking an abort flag at the same time. Is there a better way to do that?
Kieren Johnstone asked Nov 07 '11 23:11




2 Answers

Your CPU isn't necessarily idle when a thread is in a WaitHandle.WaitAny or a Monitor.Enter. A thread that's in a wait is idle, but presumably other threads are busy executing. This is especially true of Monitor.Enter. If a thread is blocked on a lock, then one would hope the thread that has that lock is executing code rather than sitting idle.

Also, if your thread is using the WaitAny to read from a queue, then it's likely that the queue simply doesn't have anything in it. That's not a performance problem for the consumer code. It just means that the producer isn't putting things into the queue fast enough. Now, that might be because the producer is slow, or because data isn't coming in fast enough.

If you're processing data faster than it can come in, then it doesn't look like you have a performance problem. Certainly not on the consumer side.

As far as using WaitAny for queuing, I would suggest that you use BlockingCollection and the methods that take a cancellation token, like TryAdd(T, Int32, CancellationToken). Converting to cancellation tokens really simplified my multi-threaded queuing code.
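As a rough illustration of that suggestion, here is a minimal sketch of a bounded producer/consumer queue where the cancellation token takes the place of a separate abort flag. The class and method names (QueueSketch, Consume, Run), the capacity of 4, and the integer payload are all made up for the example; only BlockingCollection, CancellationTokenSource, GetConsumingEnumerable and TryAdd come from the framework.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical names throughout; this is a sketch, not the asker's actual code.
public static class QueueSketch
{
    // Consumes items until the producer calls CompleteAdding or the token is cancelled.
    // Returns the sum of the consumed items so the caller can check the result.
    public static int Consume(BlockingCollection<int> queue, CancellationToken token)
    {
        int sum = 0;
        try
        {
            // Blocks while the queue is empty; throws OperationCanceledException
            // if the token is cancelled while waiting.
            foreach (int item in queue.GetConsumingEnumerable(token))
            {
                sum += item;
            }
        }
        catch (OperationCanceledException)
        {
            // Abort requested: return whatever was processed so far.
        }
        return sum;
    }

    public static int Run()
    {
        using (var cts = new CancellationTokenSource())
        using (var queue = new BlockingCollection<int>(boundedCapacity: 4))
        {
            Task<int> consumer = Task.Run(() => Consume(queue, cts.Token));

            for (int i = 1; i <= 10; i++)
            {
                // Blocks while the queue is full. Timeout.Infinite (-1) waits
                // indefinitely; cancelling the token would interrupt the wait.
                queue.TryAdd(i, Timeout.Infinite, cts.Token);
            }

            queue.CompleteAdding();   // lets the consumer's loop finish after draining
            return consumer.Result;
        }
    }
}
```

Calling cts.Cancel() from any thread wakes up both a blocked producer and a blocked consumer, which is the job the abort handle was doing in the WaitAny call.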

Jim Mischel answered Sep 30 '22 04:09


The profiling statistics do not include the time when threads were blocked.

The sampling-based profiler basically asks each core to report back after every X (say 1,000,000) non-idle cycles. Each time a core reports back, the profiler remembers the current call stack. The profiling results are reconstructed from the call stacks that the profiler recorded.

From the profiling results, you know that 14.23% of the time a core was doing work, it was executing the instructions in WaitHandle.WaitAny. If your program is CPU-bound, optimizing the WaitAny part (e.g., using a different primitive) could have a significant impact on the performance. However, if the program is not CPU-bound and spends the majority of its time waiting on a server, disk, another process or some other external input, then optimizing the WaitAny-related code will not be very useful.

So, your next step should be to figure out what the CPU utilization of your program is. Also, note that the Concurrency Visualizer that Ilian mentioned can be useful for understanding how the threads in your program spend their time.
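One rough way to get a CPU utilization figure from inside the process is to compare the processor time it consumed with the wall-clock time that elapsed, scaled by the number of cores. The sketch below assumes that approach; CpuUtilizationSketch and Measure are hypothetical names, and the result is only an approximation (Task Manager or performance counters would give a similar picture from outside the process).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; a sketch of one way to estimate utilization.
public static class CpuUtilizationSketch
{
    // Returns the fraction of the available core time that this process
    // used while `work` was running (0.0 means idle, near 1.0 means all cores busy).
    public static double Measure(Action work)
    {
        Process self = Process.GetCurrentProcess();
        TimeSpan cpuBefore = self.TotalProcessorTime;
        Stopwatch wall = Stopwatch.StartNew();

        work();

        wall.Stop();
        self.Refresh(); // discard cached process info before reading it again
        TimeSpan cpuUsed = self.TotalProcessorTime - cpuBefore;

        // Total core time that was available during the measurement window.
        double availableMs = wall.Elapsed.TotalMilliseconds * Environment.ProcessorCount;
        return availableMs > 0 ? cpuUsed.TotalMilliseconds / availableMs : 0.0;
    }
}
```

A value close to 1.0 suggests the program is CPU-bound, in which case the samples in WaitAny are worth attacking; a low value suggests the threads are mostly blocked, and the sampled percentages describe only a small slice of the total run time.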

Igor Ostrovsky answered Sep 30 '22 06:09