I have written the following atomic template with a view to mimicking the atomic operations which will be available in the upcoming C++0x standard.
However, I am not sure that the __sync_synchronize() call I have around the returning of the underlying value is necessary.
From my understanding, __sync_synchronize() is a full memory barrier and I'm not sure I need such a costly call when returning the object value.
I'm pretty sure it'll be needed around the setting of the value, but I could also implement this with the assembly:
__asm__ __volatile__ ( "rep;nop": : :"memory" );
Does anyone know whether I definitely need the __sync_synchronize() on return of the object?
M.
template < typename T >
struct atomic
{
private:
volatile T obj;
public:
atomic( const T & t ) :
obj( t )
{
}
inline operator T()
{
__sync_synchronize(); // Not sure this is overkill
return obj;
}
inline atomic< T > & operator=( T val )
{
__sync_synchronize(); // Not sure if this is overkill
obj = val;
return *this;
}
inline T operator++()
{
return __sync_add_and_fetch( &obj, (T)1 );
}
inline T operator++( int )
{
return __sync_fetch_and_add( &obj, (T)1 );
}
inline T operator+=( T val )
{
return __sync_add_and_fetch( &obj, val );
}
inline T operator--()
{
return __sync_sub_and_fetch( &obj, (T)1 );
}
inline T operator--( int )
{
return __sync_fetch_and_sub( &obj, (T)1 );
}
inline T operator-=( T val )
{
return __sync_sub_and_fetch( &obj, val );
}
// Perform an atomic CAS operation
// returning the value before the operation
inline T exchange( T oldVal, T newVal )
{
return __sync_val_compare_and_swap( &obj, oldVal, newVal );
}
};
Update: I want to make sure that the operations are consistent in the face of read/write re-ordering due to compiler optimisations.
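For comparison, here is a rough sketch of how the operations in the template above map onto std::atomic from the finished standard. The names cas_returning_old and demo_std_atomic_ops are hypothetical and exist only for this illustration; the default (sequentially consistent) ordering is assumed throughout.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper mirroring the template's exchange(): it returns the
// value held before the compare-and-swap, like __sync_val_compare_and_swap.
template <typename T>
T cas_returning_old(std::atomic<T>& obj, T oldVal, T newVal)
{
    // On failure, compare_exchange_strong writes the current value into
    // oldVal; on success, oldVal already equals the previous value.
    obj.compare_exchange_strong(oldVal, newVal);
    return oldVal;
}

// Hypothetical demo, single-threaded, only to show the return values.
int demo_std_atomic_ops()
{
    std::atomic<int> a(5);
    int pre  = ++a;   // like __sync_add_and_fetch: yields the new value (6)
    int post = a++;   // like __sync_fetch_and_add: yields the old value (6)
    a += 3;           // a is now 10
    a -= 4;           // a is now 6
    int before = cas_returning_old(a, 6, 42);  // swap succeeds, yields 6
    assert(pre == 6 && post == 6 && before == 6);
    return a.load();  // 42
}
```

Plain loads and stores of a std::atomic need no hand-written barrier; the ordering is part of the operation itself.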
__sync_synchronize(): This built-in function issues a full memory barrier. No memory access is moved across the call in either direction, by the compiler or by the processor.
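In order to solve this problem, C++ offers atomic variables that are thread-safe. For the built-in integral and pointer types, std::atomic is usually lock-free and implemented with hardware atomic instructions; an implementation may fall back to an internal lock only for types the hardware cannot handle atomically.
std::atomic works with any trivially copyable type. The integral types (including the character types) and pointers get extra operations such as fetch_add, bool is supported without arithmetic, and floating-point types gained fetch_add and fetch_sub in C++20.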
x++ in C and C++ doesn't have atomic behavior.
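As a minimal sketch of the difference, the function below has several threads increment one shared std::atomic<int>. The name count_with_atomic and its parameters are made up for this example. With a plain int in place of the atomic, the same loop would be a data race and could lose updates.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: starts `threads` threads, each incrementing a shared
// std::atomic<int> `iterations` times, and returns the final count.
int count_with_atomic(int threads, int iterations)
{
    assert(threads >= 0 && iterations >= 0);  // precondition of this sketch
    std::atomic<int> counter(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                ++counter;  // atomic read-modify-write, no lost updates
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.load();
}
```

Because every increment is a single atomic read-modify-write, the result is always threads * iterations.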
volatile T obj;
volatile is useless here, all the more so because you place all the barriers yourself.
inline T operator++( int )
inline is unneeded, since it is implied when the method is defined inside the class.
inline operator T()
{
__sync_synchronize(); // (I)
T tmp=obj;
__sync_synchronize(); // (II)
return tmp;
}
inline atomic< T > & operator=( T val )
{
__sync_synchronize(); // (III)
obj = val;
__sync_synchronize(); // (IV)
return *this;
}
To ensure total ordering of the memory accesses on read and write, you need two barriers on each access, as shown above. I would be happy with only barriers (II) and (III), as they suffice for the uses I came up with (e.g. a pointer or boolean saying the data is there, or a spinlock). But unless specified otherwise, I would not omit the others, because someone might need them. (It would be nice if someone showed that some of the barriers can be omitted without restricting possible uses, but I don't think it's possible.)
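The "boolean saying data is there" use can be sketched with the standard fences, which is one way to see why a barrier after the read and a barrier before the write are the ones that matter for it. This is only an illustration: publish_and_read is a hypothetical name, and the acquire and release fences used here are weaker than the full barrier of __sync_synchronize(), though the C++11 memory model guarantees they are enough for this particular pattern.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: hand one int from a producer to a consumer thread.
int publish_and_read(int value)
{
    int payload = 0;                 // plain data being handed over
    std::atomic<bool> ready(false);  // flag saying "the data is there"
    int seen = 0;

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();  // wait for the flag
        }
        // Plays the role of barrier (II): placed after reading the flag,
        // so the read of payload cannot move above it.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });

    payload = value;
    // Plays the role of barrier (III): placed before writing the flag,
    // so the write to payload cannot move below it.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);

    consumer.join();
    assert(seen == value);  // the consumer must observe the published data
    return seen;
}
```

Removing either fence breaks the guarantee: the consumer could then see the flag set while still reading a stale payload.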
Of course, this would be unnecessarily complicated and slow.
That said, I would just drop the barriers, and even the idea of using barriers anywhere in a template like this. Note that:
BTW, the C++0x interface allows you to specify precise memory ordering constraints.
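As a sketch of what those constraints look like in the finished standard, here is a small spinlock that asks only for acquire ordering when taking the lock and release ordering when dropping it. The class spinlock and the function count_with_spinlock are hypothetical names for this example, not part of any library.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock built on std::atomic_flag.
class spinlock
{
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock()
    {
        // acquire: accesses after lock() cannot move above taking the lock
        while (flag.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock()
    {
        // release: accesses before unlock() cannot move below dropping it
        flag.clear(std::memory_order_release);
    }
};

// Hypothetical helper: increments a plain int from several threads,
// protected only by the spinlock above, and returns the final count.
int count_with_spinlock(int threads, int iterations)
{
    assert(threads >= 0 && iterations >= 0);  // precondition of this sketch
    spinlock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // safe: only one thread holds the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter;
}
```

Neither operation needs a full barrier here, which is the kind of saving the explicit memory_order arguments are meant to allow.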