CRITICAL_SECTION locking (enter) and unlocking (leave) are efficient because CS testing is performed in user space without making the kernel system call that a mutex makes. Unlocking is performed entirely in user space, whereas ReleaseMutex requires a system call.
I just read these sentences in this book.
What the kernel system call mean? Could you give me the function's name?
I'm a English newbie. I interpreted them like this.
Another question.
The implementation of critical sections in Windows has changed over the years, but it has always been a combination of user-mode and kernel calls.
The CRITICAL_SECTION is a structure that contains a user-mode updated values, a handle to a kernel-mode object - EVENT or something like that, and debug information.
EnterCriticalSection uses an interlocked test-and-set operation to acquire the lock. If successful, this is all that is required (almost, it also updates the owner thread). If the test-and-set operation fails to aquire, a longer path is used which usually requires waiting on a kernel object with WaitForSignleObject
. If you initialized with InitializeCriticalSectionAndSpinCount
then EnterCriticalSection
may spin an retry to acquire using interlocked operation in user-mode.
Below is a diassembly of the "fast" / uncontended path of EnterCriticialSection
in Windows 7 (64-bit) with some comments inline
0:000> u rtlentercriticalsection rtlentercriticalsection+35
ntdll!RtlEnterCriticalSection:
00000000`77ae2fc0 fff3 push rbx
00000000`77ae2fc2 4883ec20 sub rsp,20h
; RCX points to the critical section rcx+8 is the LockCount
00000000`77ae2fc6 f00fba710800 lock btr dword ptr [rcx+8],0
00000000`77ae2fcc 488bd9 mov rbx,rcx
00000000`77ae2fcf 0f83e9b1ffff jae ntdll!RtlEnterCriticalSection+0x31 (00000000`77ade1be)
; got the critical section - update the owner thread and recursion count
00000000`77ae2fd5 65488b042530000000 mov rax,qword ptr gs:[30h]
00000000`77ae2fde 488b4848 mov rcx,qword ptr [rax+48h]
00000000`77ae2fe2 c7430c01000000 mov dword ptr [rbx+0Ch],1
00000000`77ae2fe9 33c0 xor eax,eax
00000000`77ae2feb 48894b10 mov qword ptr [rbx+10h],rcx
00000000`77ae2fef 4883c420 add rsp,20h
00000000`77ae2ff3 5b pop rbx
00000000`77ae2ff4 c3 ret
So the bottom line is that if the thread does not need to block it will not use a system call, just an interlocked test-and-set operation. If blocking is required, there will be a system call. The release path also uses an interlocked test-and-set and may require a system call if other threads are blocked.
Compare this to Mutex which always requires a system call NtWaitForSingleObject
and NtReleaseMutant
Calling to the kernel requires a context switch, which is takes a small (but measurable) performance hit for every context switch. The function in question is ReleaseMutex()
itself.
The critical section functions are available in kernel32.dll
(at least from the caller's point of view - see comments for discussion about ntdll.dll
) and can often avoid making any calls into the kernel.
It is worthwhile to know that Mutex objects can be accessed from different processes at the same time. On the other hand, CRITICAL_SECTION
objects are limited to one process.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With