I have a pair of unsigned int32
std::atomic<u32> _start;
std::atomic<u32> _end;
Sometimes I want to set start or end with compare exchange, so I don't want spurious failures that could be caused by using CAS on the entire 64bit pair. I just want to use 32 bit CAS.
_end.compare_exchange_strong(old_end, new_end);
Now I could fetch both start and end as one atomic 64bit read. Or two separate 32 bit reads. Wouldn't it be faster to do one 64bit atomic fetch (with the compiler adding the appropriate memory fence) rather than two separate 32 atomic bit reads with two memory fences (or would the compiler optimize that away?)
If so, how would I do that in c++11?
The standard doesn't guarantee that std::atomics have the same size as the underlying type, nor that the operations on atomic are lockfree (although they are likely to be for uint32 at least). Therefore I'm pretty sure there isn't any conforming way to combine them into one 64bit atomic operation. So you need to decide whether you want to manually combine the two variables into a 64bit one (and only work with 64bit operations) or not. 
As an example the platform might not support 64bit CAS (for x86 that was added with the first Pentium IIRC, so it would not be availible when compiling 486 compatible. In that case it needs to lock somehow, so the atomic might contain both the 64bit variable and the lock. Of 
Concerning the fences: Well that depends on the memory_order you specify for your operation. If the memory order specifies that the two operations need to be visible in the order they are excuted in, the compiler will obviously not be able to optimize a fence away, otherwise it might. Of course assuming you target x86 only memory_order_seq_cst will actually emit a barrierinstruction from what I remember, so anything less would impede with instruction reordering done by the compiler, but wouldn't have an actual penalty).
Of course depending on your platform you might get away with treating two std::atomic<int32> as one of int64 doing the casting either via union or reinterpret_cast, just be advised that this behaviour is not required by the standard and can (at least theoretically) stop working at anytime (new compiler verison, different optimization settings,...)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With