
What is the performance of std::atomic vs non-atomic variables?

I am curious as to the performance of using std::atomic<float> vs a normal float in an application. I am also curious as to what affects this. I often see topics on the performance of atomics vs a mutex, but I have found it harder to find information on atomics vs non-atomics.

I am not using this as a way to choose to make my code thread-safe or not, just wanting to understand the overhead involved.

(EDIT: At this point in the original question I gave an example (see below) that was supposed to be illustrative of a change of implementation rather than ask a specific question about that code. This seemed to confuse people about what I was asking so I have taken it out.)

I basically want to know what the broad factors are that affect performance of std::atomic. Is it the platform? The way they are used? Is it slower to use atomics accessed approximately the same amount by two threads rather than if one thread accesses them 95% of the time and the other one only occasionally?

Also, is there any difference between a std::atomic<int> and a std::atomic<float> in this regard?

Thanks in advance,

Adam


Example from original question:

Basically, I tried making a million floats and writing values to them 200 times. This took 0.87 seconds for me. Once I changed them to std::atomic<float>, this took about 2.5 seconds. So this implies it is about 3x as expensive to use std::atomic<float>.

I tried this but for reading values rather than writing, and found that a normal float and std::atomic<float> take the same amount of time.

But is this affected by other things? If another thread is writing/reading to my atomics, does this slow other reads/writes to the same variable down? Presumably so, but how can I understand this better?

asked Aug 14 '18 17:08 by Adam Stark



1 Answer

Atomic stores that do not specify a memory order use the default, std::memory_order_seq_cst, and are expensive because the compiler has to emit additional ordering instructions. On x86, a default (sequentially consistent) atomic store to a float looks like this:

std::atomic<float> f;
f.store(3.14f);  // no order given, so this uses std::memory_order_seq_cst

gcc produces the following instructions:

0x00000000004006d0 <+0>:     movl   $0x4048f5c3,0x20096a(%rip)        # 0x601044 <f>
0x00000000004006da <+10>:    mfence

The mfence instruction is expensive because it makes the core wait until its store buffer has been drained, so that the store is visible to other cores before any later memory operation proceeds.

You can try running the test with relaxed ordering instead:

f.store(3.14, std::memory_order_relaxed);

This gets rid of the mfence and will probably show a significant difference in performance. The cost of a relaxed store is close to, and on some platforms equal to, that of a non-atomic store.
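To try this comparison yourself, here is a minimal sketch of such a test. It is not the code from the question: time_atomic_stores is a hypothetical helper written for illustration, and the absolute numbers it reports will depend on your compiler, optimisation flags and hardware.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper (not from the original post): performs `rounds` passes
// of stores over every element of `data` using the memory order `Order`,
// and returns the elapsed wall-clock time in seconds.
template <std::memory_order Order>
double time_atomic_stores(std::vector<std::atomic<float>>& data, int rounds) {
    assert(rounds > 0);  // precondition: at least one pass

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (auto& a : data) {
            // Each element is written once per pass with the requested order.
            a.store(static_cast<float>(r), Order);
        }
    }
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double>(stop - start).count();
}
```

With a std::vector<std::atomic<float>> data(1000000), calling time_atomic_stores<std::memory_order_seq_cst>(data, 200) and then time_atomic_stores<std::memory_order_relaxed>(data, 200) mirrors the sizes mentioned in the question. The memory order is a template parameter so that it is a compile-time constant, which should let the compiler pick the instruction sequence for exactly that ordering.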

Regarding "is there any difference between a std::atomic<int> and a std::atomic<float> in this regard?":

On the assumption that both are lock-free, probably not. The ordering constraints are responsible for the reduced performance.
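Whether that assumption holds on your platform can be checked at run time with is_lock_free(). The following is a small sketch; LockFreeReport and check_lock_free are hypothetical names used only for this example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical result type for this example: one flag per atomic type.
struct LockFreeReport {
    bool int_lock_free;
    bool float_lock_free;
};

// Hypothetical helper: asks the implementation whether std::atomic<int>
// and std::atomic<float> objects are lock-free on this platform.
LockFreeReport check_lock_free() {
    std::atomic<int> i{0};
    std::atomic<float> f{0.0f};

    // Sanity check: both objects hold the values they were initialised with.
    assert(i.load() == 0 && f.load() == 0.0f);

    return LockFreeReport{i.is_lock_free(), f.is_lock_free()};
}
```

If both flags come back true, neither type falls back to an internal lock, and any remaining cost difference between them is unlikely to come from the value type itself.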

answered Oct 30 '22 20:10 by LWimsey
