I have an issue in my use of critical sections. My app has a large number of threads, say 60, which all need access to a global resource. I therefore protect that resource with a critical section. This works perfectly during operation, however when my application shuts down, I trigger the threads to quit, and then destroy the critical section.
The problem comes if some of those threads are waiting on the critical section at exit time, and thus are blocked from quitting themselves.
I've written a wrapper around the windows CriticalSection calls that has an 'Initialised' flag, which I set to true when the crit is created, and set to false when I'm about to leave the crit (both cases are set when inside the crit). This flag is checked before the 'enter crit' wrapper function tries entering the crit, bypassing the request if the flag is false. The flag is also checked the moment any thread successfully enters the crit, making it immediately leave the crit if it's false.
What I to do prior to deleting the crit is set the flag to false, then wait for any waiting threads to: be allowed into the crit; see the Initialised flag is false; then leave the crit (which should be a pretty quick operating for each thread).
I check the number of threads waiting for access to the crit by checking the LockCount inside the CRITICAL_SECTION struct, and waiting until it hits 0 (in XP, that's LockCount - (RecursionCount-1)
; in 2003 server and above, the lock count is ((-1) - (LockCount)) >> 2
), before I then destroy the critical section.
This should be sufficient, however I'm finding that the LockCount reaches 0 when there's still one thread (always just one thread, never more) waiting to enter the crit, meaning if I delete the crit at that point, the other thread subsequently wakes up from waiting on the crit, and causes a crash as the CRITICAL_SECTION object has by that time been destroyed.
If I keep my own internal lock count of threads waiting for access, I have the correct count; however this isn't ideal as I have to increment this count outside of the crit, meaning that value isn't protected and therefore can't be entirely relied upon at any one time.
Does anyone know why the LockCount in the CRITICAL_SECTION struct would be out by 1? If I use my own lock count, then check the CRITICAL_SECTION's lock count after that last thread has exited (and before I destroy the crit), it's still 0...
Or, is there a better way for me to protect the global resource in my app with that many threads, other than a critical section?
This is my wrapper struct:
typedef struct MY_CRIT {
BOOL Initialised;
CRITICAL_SECTION Crit;
int MyLockCount;
}
Here's my Crit init function:
BOOL InitCrit( MY_CRIT *pCrit )
{
if (pCrit)
{
InitializeCriticalSection( &pCrit->Crit );
pCrit->Initialised = TRUE;
pCrit->MyLockCount = 0;
return TRUE;
}
// else invalid pointer
else
return FALSE;
}
This is my enter crit wrapper function:
BOOL EnterCrit( MY_CRIT *pCrit )
{
// if pointer valid, and the crit is initialised
if (pCrit && pCrit->Initialised)
{
pCrit->MyLockCount++;
EnterCriticalSection( &pCrit->Crit );
pCrit->MyLockCount--;
// if still initialised
if (pCrit->Initialised)
{
return TRUE;
}
// else someone's trying to close this crit - jump out now!
else
{
LeaveCriticalSection( &pCrit->Crit );
return FALSE;
}
}
else // crit pointer is null
return FALSE;
}
And here's my FreeCrit wrapper function:
void FreeCrit( MY_CRIT *pCrit )
{
LONG WaitingCount = 0;
if (pCrit && (pCrit->Initialised))
{
// set Initialised to FALSE to stop any more threads trying to get in from now on:
EnterCriticalSection( &pCrit->Crit );
pCrit->Initialised = FALSE;
LeaveCriticalSection( &pCrit->Crit );
// loop until all waiting threads have gained access and finished:
do {
EnterCriticalSection( &pCrit->Crit );
// check if any threads are still waiting to enter:
// Windows XP and below:
if (IsWindowsXPOrBelow())
{
if ((pCrit->Crit.LockCount > 0) && ((pCrit->Crit.RecursionCount - 1) >= 0))
WaitingCount = pCrit->Crit.LockCount - (pCrit->Crit.RecursionCount - 1);
else
WaitingCount = 0;
}
// Windows 2003 Server and above:
else
{
WaitingCount = ((-1) - (pCrit->Crit.LockCount)) >> 2;
}
// hack: if our own lock count is higher, use that:
WaitingCount = max( WaitingCount, pCrit->MyLockCount );
// if some threads are still waiting, leave the crit and sleep a bit, to give them a chance to enter & exit:
if (WaitingCount > 0)
{
LeaveCriticalSection( &pCrit->Crit );
// don't hog the processor:
Sleep( 1 );
}
// when no other threads are waiting to enter, we can safely delete the crit (and leave the loop):
else
{
DeleteCriticalSection( &pCrit->Crit );
}
} while (WaitingCount > 0);
}
}
You are responsible to make sure that CS is not in use any longer before destroying it. Let us say that no other thread is currently trying to enter, but there is a chance that it is going to attempt very soon. Now you destroy the CS, what this concurrent thread is going to do? At its full pace it hits deleted critical section causing memory access violation?
The actual solution depends on your current app design, but if you are destroying threads, then you will perhaps want to flag your request to stop those threads, and then wait on theit handles to wait for their destruction. And then complete with deleting critical section when you are sure that threads are done.
Note that it is unsafe to rely on CS member values such as .LockCount
, and having done things right way you will prehaps not even need thing like IsWindowsXPOrBelow
. Critical section API suggest that you use CRITICAL_SECTION
structure as "black box" leaving the internals to be implementation specific.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With