I made a very simple spinlock using the Interlocked functions in Windows and tested it on a dual-core CPU (two threads that increment a variable);
The program seems to work OK (it gives the same result every time, which is not the case when no synchronization is used), but Intel Parallel Inspector says that there is a race condition at value += j (see the code below). The warning disappears when using Critical Sections instead of my SpinLock.
Is my implementation of SpinLock correct or not ? It's really strange, because all the used operations are atomic and have the proper memory barriers and it shouldn't lead to race conditions.
class SpinLock
{
int *lockValue;
SpinLock(int *value) : lockValue(value) { }
void Lock() {
while(InterlockedCompareExchange((volatile LONG*)lockValue, 1, 0) != 0) {
WaitABit();
}
}
void Unlock() { InterlockedExchange((volatile LONG*)lockValue, 0); }
};
The test program:
static const int THREADS = 2;
HANDLE completedEvents[THREADS];
int value = 0;
int lock = 0; // Global.
DWORD WINAPI TestThread(void *param) {
HANDLE completed = (HANDLE)param;
SpinLock testLock(&lock);
for(int i = 0;i < 1000*20; i++) {
for(int j = 0;j < 10*10; j++) {
// Add something to the variable.
testLock.Lock();
value += j;
testLock.Unlock();
}
}
SetEvent(completed);
}
int main() {
for(int i = 0; i < THREADS; i++) {
completedEvents[i] = CreateEvent(NULL, true, false, NULL);
}
for(int i = 0; i < THREADS; i++) {
DWORD id;
CreateThread(NULL, 0, TestThread, completedEvents[i], 0, &id);
}
WaitForMultipleObjects(THREADS, completedEvents, true, INFINITE);
cout<<value;
}
Parallel Inspector's documentation for data race suggests using a critical section or a mutex to fix races on Windows. There's nothing in it which suggests that Parallel Inspector knows how to recognise any other locking mechanism you might invent.
Tools for analysis of novel locking mechanisms tend to be static tools which look at every possible path through the code, Parallel Inspector's documentation implies that it executes the code once.
If you want to experiment with novel locking mechanisms, the most common tool I've seen used in academic literature is the Spin model checker. There's also ESP, which might reduce the state space, but I don't know if it's been applied to concurrent problems, and also the mobility workbench which would give an analysis if you can couch your problem in pi-calculus. Intel Parallel Inspector doesn't seem anything like as complicated as these tools, but rather designed to check for commonly occurring issues using heuristics.
For other poor folks in a similar situation to me: Intel DOES provide a set of includes and libraries for doing exactly this sort of thing. Check in the Inspector installation directory (you'll see \include, \lib32 and \lib64 in the installation directory) for those materials. Documentation on how to use them (as of June 2018, though Intel cares nothing about keeping links consistent):
https://software.intel.com/en-us/inspector-user-guide-windows-apis-for-custom-synchronization
There are 3 functions:
void __itt_sync_acquired(void *addr)
void __itt_sync_releasing(void *addr)
void __itt_sync_destroy(void *addr)
I'm pretty sure it should be implemented as follows:
class SpinLock
{
long lockValue;
SpinLock(long value) : lockValue(value) { }
void Lock() {
while(InterlockedCompareExchange(&lockValue, 1, 0) != 0) {
WaitABit();
}
}
void Unlock() { InterlockedExchange(&lockValue, 0); }
};
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With