Are there any more efficient ways for atomically adding two floats?

Question

I have an array of floats that is updated by several threads. The array is much larger than the number of threads, so simultaneous access to any particular float is rare. I need a solution for C++03.

The following code atomically adds a value to one of the floats (live demo). Assuming it works, it might be the best solution. The only alternative I can think of is dividing the array into chunks and protecting each chunk with a mutex, but I don't expect that to be more efficient.

My questions are as follows. Are there any alternative solutions for adding floats atomically? Can anyone anticipate which is the most efficient? Yes, I am willing to do some benchmarks. Maybe the solution below can be improved by relaxing the memory-order constraints, i.e. replacing __ATOMIC_SEQ_CST with something weaker. I have no experience with that.

void atomic_add_float( float *x, float add )
{
  int *ip_x= reinterpret_cast<int*>( x ); //1
  int expected= __atomic_load_n( ip_x, __ATOMIC_SEQ_CST ); //2
  int desired;
  do  {
    float sum= *reinterpret_cast<float*>( &expected ) + add; //3
    desired=   *reinterpret_cast<int*>( &sum );
  } while( ! __atomic_compare_exchange_n( ip_x, &expected, desired, //4
                                          /* weak = */ true, 
                                          __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST ) );
}

This works as follows. At //1 the bit pattern of x is interpreted as an int, i.e. I assume that float and int have the same size (32 bits). At //2 the value to be increased is loaded atomically. At //3 the bit pattern of the int is interpreted as a float and the summand is added. (Remember that expected contains a value found at ip_x == x.) This doesn't change the value stored at ip_x == x. At //4 the result of the summation is stored at ip_x == x only if no other thread has changed the value in the meantime, i.e. if expected == *ip_x (see the documentation). If this is not the case, the do-loop continues and expected contains the updated value found at ip_x == x.
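One caveat on the casts at //3: reading an int object through a float pointer (and vice versa) conflicts with C++'s strict-aliasing rules, so an optimizing compiler is not obliged to treat it as intended. The following is a minimal sketch of a safer round trip via std::memcpy, together with a C++03-style compile-time check of the size assumption. The names float_to_bits, bits_to_float and float_and_int_have_same_size are made up for illustration and are not part of the original code.

```cpp
#include <cstring>  // std::memcpy

// C++03 has no static_assert; a negative array size forces a compile error
// if float and int differ in size.
typedef char float_and_int_have_same_size[sizeof(float) == sizeof(int) ? 1 : -1];

// Copy the bit pattern of a float into an int (hypothetical helper).
inline int float_to_bits(float f)
{
  int bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Copy the bit pattern of an int back into a float (hypothetical helper).
inline float bits_to_float(int bits)
{
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}
```

Compilers typically turn such a fixed-size memcpy into a plain register move, so this should not cost anything compared to the casts.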

GCC's functions for atomic access (__atomic_load_n and __atomic_compare_exchange_n) can easily be replaced with other compilers' equivalents.

Andriy Berestovskyy · Accepted Answer

Are there any alternative solutions for adding floats atomically? Can anyone anticipate which is the most efficient?

Sure, there are at least a few that come to mind:

Use synchronization primitives, e.g. spinlocks. This will be a bit slower than compare-exchange.
Use a transactional memory extension (see Wikipedia). This will be faster, but it might limit portability.
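For comparison, the spinlock variant could look roughly like the sketch below. It uses GCC's __atomic_test_and_set and __atomic_clear builtins. SpinLock, NUM_LOCKS and locked_add_float are illustrative names, and the stripe count of 64 is an arbitrary assumption, not a tuned value.

```cpp
#include <cstddef>  // std::size_t

// Minimal spinlock on top of GCC's __atomic builtins (C++03-compatible).
// The flag must start out false (unlocked).
struct SpinLock
{
  bool flag;

  void lock()
  {
    // Spin until the previous value was false, i.e. until we took the lock.
    while (__atomic_test_and_set(&flag, __ATOMIC_ACQUIRE)) { }
  }

  void unlock()
  {
    __atomic_clear(&flag, __ATOMIC_RELEASE);
  }
};

// Hypothetical striping: NUM_LOCKS spinlocks shared by all array elements.
const std::size_t NUM_LOCKS = 64;
static SpinLock locks[NUM_LOCKS];  // static storage: zero-initialized, all unlocked

// Add `add` to data[i] while holding the lock that guards index i.
void locked_add_float(float *data, std::size_t i, float add)
{
  SpinLock &l = locks[i % NUM_LOCKS];
  l.lock();
  data[i] += add;
  l.unlock();
}
```

Each update now costs two atomic operations (lock and unlock) instead of one successful compare-exchange, which is where the expected slowdown comes from.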

Overall, your solution is quite reasonable: it is fast and yet will work on any platform.

In my opinion the needed memory orders are:

__ATOMIC_ACQUIRE -- when we read the value in __atomic_load_n()
__ATOMIC_RELEASE -- when __atomic_compare_exchange_n() succeeds
__ATOMIC_ACQUIRE -- when __atomic_compare_exchange_n() fails
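Applied to the function from the question, that would look roughly as follows. This is only a sketch: the name atomic_add_float_acq_rel is made up, the pointer cast for the atomic operations is kept from the question, and the float/int conversion inside the loop uses std::memcpy instead of casts.

```cpp
#include <cassert>
#include <cstring>  // std::memcpy

// Same CAS loop as in the question, with the memory orders suggested in this
// answer. Assumes sizeof(float) == sizeof(int), as the question does.
void atomic_add_float_acq_rel(float *x, float add)
{
  assert(sizeof(float) == sizeof(int));
  int *ip_x = reinterpret_cast<int*>(x);
  int expected = __atomic_load_n(ip_x, __ATOMIC_ACQUIRE);
  int desired;
  do {
    float sum;
    std::memcpy(&sum, &expected, sizeof sum);     // int bits -> float
    sum += add;
    std::memcpy(&desired, &sum, sizeof desired);  // float bits -> int
  } while (!__atomic_compare_exchange_n(ip_x, &expected, desired,
                                        /* weak = */ true,
                                        __ATOMIC_RELEASE,    // on success
                                        __ATOMIC_ACQUIRE));  // on failure
}
```

On x86 the compare-exchange compiles to a locked instruction either way, so any gain is likely to show up mainly on weakly ordered targets such as ARM. Benchmarking on the actual hardware is the only way to know.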

Tags: c++, multithreading, atomic, c++03

Claas Bontus
