
Atomic increment of 64 bit variable on 32 bit environment

While writing an answer to another question, some interesting things came up, and now I can't understand how Interlocked.Increment(ref long value) works on 32-bit systems. Let me explain.

The native InterlockedIncrement64 is not available when compiling for a 32-bit environment. OK, that makes sense: in .NET you can't align memory as required, and since it may be called from managed code, they dropped it.

In .NET we can call Interlocked.Increment() with a reference to a 64-bit variable. There is still no constraint on its alignment (for example inside a structure, where we may also use FieldOffset and StructLayout), yet the documentation doesn't mention any limitation (AFAIK). It's magic, it works!

Hans Passant noted that Interlocked.Increment() is a special method recognized by the JIT compiler, which emits a call to COMInterlocked::ExchangeAdd64(). That in turn calls FastInterlockExchangeAddLong, a macro for InterlockedExchangeAdd64, which shares the same limitations as InterlockedIncrement64.

Now I'm perplexed.

Forget the managed environment for a second and go back to native. Why can't InterlockedIncrement64 work when InterlockedExchangeAdd64 does? InterlockedIncrement64 is a macro; if intrinsics aren't available and InterlockedExchangeAdd64 works, then it could be implemented as a call to InterlockedExchangeAdd64...
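To make that last point concrete, here is a minimal sketch in portable C++, assuming std::atomic<std::int64_t> stands in for the Windows primitives. The name increment64 is hypothetical and this is not how the SDK defines the macro; it only shows that an increment which returns the new value can be layered on an exchange-add which returns the old one.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not an SDK function: increments `target` atomically
// and returns the incremented value, as InterlockedIncrement64 does.
std::int64_t increment64(std::atomic<std::int64_t>& target) {
    // fetch_add plays the role of InterlockedExchangeAdd64 here: it returns
    // the value held before the addition, so add 1 to report the new value.
    return target.fetch_add(1) + 1;
}
```

Calling increment64 on a counter that holds 41 returns 42 and leaves 42 stored in the counter.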

Back to managed: how is an atomic 64-bit increment implemented on 32-bit systems? I suppose the sentence "This function is atomic with respect to calls to other interlocked functions" is important, but I still haven't seen any code that does it (thanks to Hans for pointing to the deeper implementation). Let's pick the InterlockedExchangeAdd64 implementation from WinBase.h, used when intrinsics aren't available:

FORCEINLINE
LONGLONG
InterlockedExchangeAdd64(
    _Inout_ LONGLONG volatile *Addend,
    _In_    LONGLONG Value
    )
{
    LONGLONG Old;

    do {
        Old = *Addend;
    } while (InterlockedCompareExchange64(Addend,
                                          Old + Value,
                                          Old) != Old);

    return Old;
}

How can it be atomic for reading/writing?

Adriano Repetti asked Jun 01 '16 07:06


1 Answer

You have to keep following the trail: InterlockedExchangeAdd64() takes you to the WinNt.h SDK header file, where you'll see many versions of it, depending on the target architecture.

This generally collapses to:

#define InterlockedExchangeAdd64 _InterlockedExchangeAdd64

Which passes the buck to a compiler intrinsic, declared in vc/include/intrin.h and implemented by the compiler's back-end.

Or in other words, different builds of the CLR will have different implementations of it. There have been many over the years: x86, x64, Itanium, ARM, ARM8 and PowerPC off the top of my head, and I'm surely missing some that used to boot Windows CE before Apple made it irrelevant. For x86 this is ultimately taken care of by LOCK CMPXCHG8B, a dedicated processor instruction that can handle misaligned 64-bit variables. I don't have the hardware to see what it looks like on other 32-bit processors.
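As for why the retry loop quoted in the question is safe even though its first read is a plain one, here is a minimal sketch in portable C++, assuming std::atomic<std::int64_t> stands in for the Windows primitives (exchange_add_via_cas is a hypothetical name, not an SDK function). The snapshot does not need to be trustworthy: the compare-and-swap only stores when the variable still holds exactly the snapshot value, so a stale or torn snapshot just costs another trip around the loop. On x86 the compare-and-swap is the step that would typically end up as LOCK CMPXCHG8B.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not an SDK function: adds `value` to `target` using a
// compare-and-swap retry loop and returns the value held before the addition.
std::int64_t exchange_add_via_cas(std::atomic<std::int64_t>& target,
                                  std::int64_t value) {
    // Initial snapshot. With std::atomic this load is itself atomic; in the
    // header version it is a plain read that may be stale or torn.
    std::int64_t old = target.load(std::memory_order_relaxed);

    // The store happens only if `target` still equals `old`. On failure,
    // compare_exchange_weak writes the current value into `old`, so the next
    // iteration retries with a fresh snapshot.
    while (!target.compare_exchange_weak(old, old + value)) {
        // Nothing to do here: `old` has already been refreshed.
    }
    return old;
}
```

The atomicity comes entirely from the compare-and-swap; the loop around it only decides what to do when another thread got there first.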

Do keep in mind that the target architecture for managed code is not nailed down at compile time. It is the jitter that adapts the MSIL to the target at runtime. That isn't quite so relevant for C++/CLI projects since you generally do have to pick a target if you compile with /clr instead of /clr:pure and only x86 and x64 can work. But the plumbing is in place anyway so a macro just isn't very useful.

Hans Passant answered Sep 23 '22 15:09