I am curious about the performance of using std::atomic<float> vs a normal float
in an application. I am also curious as to what affects this. I often see topics on the performance of atomics vs a mutex, but I have found it harder to find information on atomics vs non-atomics.
I am not using this as a way to choose to make my code thread-safe or not, just wanting to understand the overhead involved.
(EDIT: At this point in the original question I gave an example (see below) that was supposed to be illustrative of a change of implementation rather than ask a specific question about that code. This seemed to confuse people about what I was asking so I have taken it out.)
I basically want to know what the broad factors are that affect performance of std::atomic. Is it the platform? The way they are used? Is it slower to use atomics accessed approximately the same amount by two threads rather than if one thread accesses them 95% of the time and the other one only occasionally?
Also, is there any difference between a std::atomic<int> and a std::atomic<float> in this regard?
Thanks in advance,
Adam
Example from original question:
Basically, I tried making a million floats and writing values to them 200 times. This took 0.87 seconds for me. Once I changed them to std::atomic<float>, this took about 2.5 seconds. So this implies it is about 3x as expensive to use std::atomic<float>.
I tried this but for reading values rather than writing, and found that a normal float and std::atomic<float> take the same amount of time.
But is this affected by other things? If another thread is writing/reading to my atomics, does this slow other reads/writes to the same variable down? Presumably so, but how can I understand this better?
Atomic stores without an ordering parameter (i.e. the default) are expensive because of additional ordering instructions that are issued by the compiler.
On X86, a default (sequentially consistent) atomic store to a float would look like this:
atomic<float> f;
f.store(3.14);
gcc produces the following instructions:
0x00000000004006d0 <+0>: movl $0x4048f5c3,0x20096a(%rip) # 0x601044 <f>
0x00000000004006da <+10>: mfence
The mfence instruction is expensive because it ensures direct visibility to other cores (i.e. it causes the store buffer to be flushed).
You can try to run a test without ordering:
f.store(3.14, std::memory_order_relaxed);
This will get rid of the mfence and probably show a significant difference in performance. It is closer, if not equal on some platforms, to non-atomic stores.
is there any difference between a std::atomic<int> and a std::atomic<float> in this regard?
On the assumption that both are lock-free, probably not. The ordering constraints are responsible for the reduced performance.