
C++0x atomic implementation in c++98 question about __sync_synchronize()

I have written the following atomic template with a view to mimicking the atomic operations that will be available in the upcoming C++0x standard.

However, I am not sure that the __sync_synchronize() calls I have placed around the reads and writes of the underlying value are necessary.

From my understanding, __sync_synchronize() is a full memory barrier and I'm not sure I need such a costly call when returning the object value.

I'm pretty sure it will be needed when setting the value, but I could also implement this with the following assembly:

__asm__ __volatile__ ( "rep;nop": : :"memory" );

Does anyone know whether I definitely need the __sync_synchronize() when returning the object's value?

M.

template < typename T >
struct atomic
{
private:
    volatile T obj;

public:
    atomic( const T & t ) :
        obj( t )
    {
    }

    inline operator T()
    {
        __sync_synchronize();   // Not sure if this is overkill
        return obj;
    }

    inline atomic< T > & operator=( T val )
    {
        __sync_synchronize();   // Not sure if this is overkill
        obj = val;
        return *this;
    }

    inline T operator++()
    {
        return __sync_add_and_fetch( &obj, (T)1 );
    }

    inline T operator++( int )
    {
        return __sync_fetch_and_add( &obj, (T)1 );
    }

    inline T operator+=( T val )
    {
        return __sync_add_and_fetch( &obj, val );
    }

    inline T operator--()
    {
        return __sync_sub_and_fetch( &obj, (T)1 );
    }

    inline T operator--( int )
    {
        return __sync_fetch_and_sub( &obj, (T)1 );
    }

    inline T operator-=( T val )
    {
        return __sync_sub_and_fetch( &obj, val );
    }

    // Perform an atomic CAS operation
    // returning the value before the operation
    inline T exchange( T oldVal, T newVal )
    {
        return __sync_val_compare_and_swap( &obj, oldVal, newVal );
    }

};

Update: I want to make sure that the operations are consistent in the face of read/write re-ordering due to compiler optimisations.

ScaryAardvark asked Mar 11 '10 08:03



1 Answer

First, some petty remarks:

volatile T obj;

volatile is useless here, all the more so because you place all the barriers yourself.

inline T operator++( int )

inline is unneeded, since it is implied when the method is defined inside the class.

Getters and setters:

inline operator T()
{
    __sync_synchronize();   // (I)
    T tmp=obj;
    __sync_synchronize();   // (II)
    return tmp;
}

inline atomic< T > & operator=( T val )
{
    __sync_synchronize();   // (III)
    obj = val;
    __sync_synchronize();   // (IV)
    return *this;
}

To guarantee total ordering of the memory accesses on read and write, you need two barriers on each access, as shown above. I would be happy with only barriers (II) and (III), as they suffice for the uses I came up with (e.g. a pointer or boolean saying the data is there, or a spinlock). But unless the interface specifies otherwise, I would not omit the others, because someone might need them. (It would be nice if someone showed that some of the barriers can be omitted without restricting the possible uses, but I don't think that is possible.)

Of course, this would be unnecessarily complicated and slow.
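As a rough sketch of the "boolean saying data is there" use case mentioned above, where only barriers (II) and (III) are kept: the names payload, ready, publish, consume and demo are hypothetical and not part of the original template, and the code assumes GCC or Clang with the __sync builtins. It follows the pre-C++11 style of the question; under the C++11 memory model the flag would have to be a std::atomic to avoid a data race.

```cpp
#include <cassert>

// Hypothetical names for illustration only.
static int payload = 0;            // plain data being handed over
static volatile int ready = 0;     // flag saying "data is there"

// Writer side: barrier (III) sits before the flag store, so the payload
// write cannot be reordered after the flag becomes visible.
void publish(int value)
{
    payload = value;
    __sync_synchronize();   // (III)
    ready = 1;
}

// Reader side: barrier (II) sits after the flag load, so the payload
// read cannot be reordered before the flag was observed.
int consume()
{
    while (!ready) { /* spin until the writer sets the flag */ }
    __sync_synchronize();   // (II)
    return payload;
}

// Single-threaded smoke test; in real use publish() and consume()
// would run on different threads.
int demo()
{
    publish(42);
    int value = consume();
    assert(value == 42);
    return value;
}
```

Barriers (I) and (IV) play no role in this particular pattern, which is why they could be dropped here.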

That said, I would just drop the barriers, along with the idea of putting barriers anywhere in a template like this. Note that:

  • the ordering semantics of that interface are entirely defined by you; if you decide the interface has barriers here or there, they must be here or there, period. If you leave the ordering undefined, you can come up with a more efficient design, because a particular problem might not need all the barriers, or might not need full barriers at all.
  • usually, you use atomics because you have a lock-free algorithm that could give you a performance advantage; an interface that prematurely pessimizes the accesses will probably be unusable as a building block for it, because the interface itself hampers performance.
  • lock-free algorithms typically involve communication that cannot be encapsulated in one atomic data type, so you need to know what is happening in the algorithm to place the barriers precisely where they belong (e.g. when implementing a lock, you need one barrier after you have acquired it and another before you release it, even though both operations are writes, at least in principle).
  • if you want to avoid problems and are not sure how to place the barriers explicitly in the algorithm, just use lock-based algorithms. There is nothing wrong with that.
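The lock example from the list above can be sketched as follows, assuming GCC or Clang. The builtins __sync_lock_test_and_set and __sync_lock_release are documented by GCC as an acquire barrier and a release barrier respectively, so neither side pays for a full barrier. The names spinlock, counter_lock, counter, locked_increment and demo are hypothetical.

```cpp
#include <cassert>

// Hypothetical spinlock sketch built on the GCC/Clang __sync builtins.
struct spinlock
{
    volatile int locked;   // 0 = free, 1 = held

    spinlock() : locked(0) {}

    void lock()
    {
        // Atomically writes 1 and returns the previous value. It acts as
        // an acquire barrier: accesses in the critical section cannot
        // move above the acquisition.
        while (__sync_lock_test_and_set(&locked, 1)) { /* spin */ }
    }

    void unlock()
    {
        // Writes 0 with release semantics: accesses in the critical
        // section cannot move below the release.
        __sync_lock_release(&locked);
    }
};

static spinlock counter_lock;
static int counter = 0;    // protected by counter_lock

// Increments the shared counter under the lock and returns the new value.
int locked_increment()
{
    counter_lock.lock();
    int result = ++counter;
    counter_lock.unlock();
    return result;
}

// Single-threaded smoke test of the lock/unlock pairing.
int demo()
{
    int first = locked_increment();
    int second = locked_increment();
    assert(second == first + 1);
    return second;
}
```

The barrier sits after the write that takes the lock and before the write that releases it, which a generic atomic assignment operator could not know to do.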

BTW, the C++0x interface allows you to specify precise memory ordering constraints.
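For illustration, here is a minimal sketch of that interface as it was standardized in the C++11 header <atomic>. It implements a "data is ready" flag with release/acquire ordering instead of full barriers. The names message, message_ready, send, receive and demo are hypothetical, and a compiler with C++11 support is assumed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names for illustration only.
static int message = 0;                         // plain, non-atomic data
static std::atomic<bool> message_ready(false);  // flag guarding the data

// Writer: the release store orders the write to `message` before the
// flag becomes visible to a reader that performs an acquire load.
void send(int value)
{
    message = value;
    message_ready.store(true, std::memory_order_release);
}

// Reader: the acquire load orders the read of `message` after the flag
// has been observed as true.
int receive()
{
    while (!message_ready.load(std::memory_order_acquire)) { /* spin */ }
    return message;
}

// Single-threaded smoke test; in real use send() and receive() would
// run on different threads.
int demo()
{
    send(7);
    int value = receive();
    assert(value == 7);
    return value;
}
```

Each load and store names its own ordering, so an algorithm pays only for the constraints it actually needs; omitting the argument defaults to std::memory_order_seq_cst.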

jpalecek answered Oct 16 '22 06:10