I was debugging a multi-threaded application and looked into the internal structure of CRITICAL_SECTION. The LockSemaphore data member of CRITICAL_SECTION struck me as an interesting one.
It looks like LockSemaphore is an auto-reset event (not a semaphore, as the name suggests), and the operating system creates this event silently the first time a thread waits on a critical section that is locked by another thread.
Now I am wondering: is a critical section always faster? An event is a kernel object, and each critical section object is associated with an event object, so how can a critical section be faster than other kernel objects like a mutex? And how does the internal event object actually affect the performance of a critical section?
Here is the structure of the CRITICAL_SECTION:

struct RTL_CRITICAL_SECTION {
    PRTL_CRITICAL_SECTION_DEBUG DebugInfo;
    LONG LockCount;
    LONG RecursionCount;
    HANDLE OwningThread;
    HANDLE LockSemaphore;
    ULONG_PTR SpinCount;
};
Windows is designed so that threads, even when running in kernel mode, are always preemptible and always interruptible. So critical sections are certainly not implemented by disabling interrupts and do not prevent preemption.
Only one thread can execute a particular critical section at any given time.
When they say that a critical section is "fast", they mean "it's cheap to acquire one when it isn't already locked by another thread".
[Note that if it is already locked by another thread, then it doesn't matter nearly so much how fast it is.]
The reason it's fast is that, before going into the kernel, it uses the equivalent of InterlockedIncrement on one of those LONG fields (probably the LockCount field), and if that succeeds, it considers the lock acquired without ever having entered the kernel.
The InterlockedIncrement API is, I believe, implemented in user mode as a LOCK INC instruction ... in other words, you can acquire an uncontested critical section without any ring transition into the kernel at all.
In performance work, few things fall into the "always" category :) If you implement something similar to an OS critical section yourself using other primitives, odds are it will be slower in most cases.
The best way to answer your question is with performance measurements. How OS objects perform is very dependent on the scenario. For example, critical sections are generally considered 'fast' if contention is low. They are also considered fast if the lock hold time is shorter than the spin-count time.
The most important thing to determine is whether contention on a critical section is the first-order limiting factor in your application. If not, then simply use a critical section normally and work on your application's primary bottleneck (or bottlenecks).
If critical section performance is critical, then there are several tuning approaches you can consider.
In summary: tuning scenarios that have lock contention can be challenging (but interesting!) work. Focus on measuring your application's performance and understanding where your hot paths are. The xperf tools in the Windows Performance Toolkit are your friend here :) We just released version 4.5 in the Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1 (available as an ISO or via the web installer), and there is a dedicated forum for the xperf tools. V4.5 fully supports Win7, Vista, and Windows Server 2008 in all versions.