CUDA

Question

When I use float atomicAdd(float *address, float val) to add a float value smaller than approx. 1e-39 to 0, the addition does not work, and the value at address remains 0.

Here is the simplest code:

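__device__ float test[6] = {0};
__global__ void testKernel() {
    // Each value is added twice: once with atomicAdd, once with a plain +=.
    float addit = sinf(1e-20);   // normal float
    atomicAdd(&test[0], addit);
    test[1] += addit;
    addit = sinf(1e-37);         // still normal (just above ~1.18e-38)
    atomicAdd(&test[2], addit);
    test[3] += addit;
    addit = sinf(1e-40);         // subnormal (denormal) float
    atomicAdd(&test[4], addit);
    test[5] += addit;
}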

When I run the code above as testKernel<<<1, 1>>>(); and stop with the debugger I see:

test    0x42697800
    [0] 9.9999997e-21
    [1] 9.9999997e-21
    [2] 9.9999999e-38
    [3] 9.9999999e-38
    [4] 0            
    [5] 9.9999461e-41

Notice the difference between test[4] and test[5]. Both did the same thing, yet the simple addition worked, and the atomic one did nothing at all. What am I missing here?

Update: System info: CUDA 5.5.20, NVidia Titan card, Driver 331.82, Windows 7x64, Nsight 3.2.1.13309.

Robert Crovella · Accepted Answer

atomicAdd is a special instruction that does not necessarily obey the flush-to-zero and rounding behavior that you select with -ftz=true or -ftz=false for other floating-point operations (e.g. an ordinary fp add).

As documented in the PTX ISA manual:

The floating-point operation .add is a single-precision, 32-bit operation. atom.add.f32 rounds to nearest even and flushes subnormal inputs and results to sign-preserving zero.

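So even though an ordinary floating-point add should not flush denormals to zero if you specify -ftz=false (which is the default, I believe, for nvcc), the floating-point atomic add to global memory always flushes them to zero. In your test, sinf(1e-40) is below the smallest normal float (about 1.18e-38), so it is a denormal; that is why only test[4] stays 0 while test[5] gets the expected value.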
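If denormal contributions must survive, one possible workaround is to build the atomic add from atomicCAS, so that the addition itself is an ordinary fp32 add rather than atom.add.f32. The following is only a minimal sketch under stated assumptions: atomicAddKeepSubnormal, compareKernel and result are hypothetical names made up for illustration, and the code has to be compiled without -ftz=true or --use_fast_math, otherwise the plain add flushes as well.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): atomic float add built on atomicCAS.
// The sum is computed with an ordinary fp32 add, which keeps subnormals
// when the file is compiled with -ftz=false.
__device__ float atomicAddKeepSubnormal(float *address, float val) {
    unsigned int *bits = reinterpret_cast<unsigned int *>(address);
    unsigned int old = *bits;
    unsigned int assumed;
    do {
        assumed = old;
        float sum = __uint_as_float(assumed) + val;
        // Store the new value only if nobody changed *address in the meantime.
        old = atomicCAS(bits, assumed, __float_as_uint(sum));
    } while (assumed != old);  // compare bit patterns, so NaN cannot loop forever
    return __uint_as_float(old);  // previous value, like atomicAdd
}

__device__ float result[2] = {0.0f, 0.0f};

__global__ void compareKernel(float tiny) {
    atomicAdd(&result[0], tiny);               // built-in atomic add
    atomicAddKeepSubnormal(&result[1], tiny);  // CAS loop with ordinary add
}

int main() {
    compareKernel<<<1, 1>>>(1e-40f);  // 1e-40f is a subnormal float
    cudaDeviceSynchronize();

    float host[2];
    cudaMemcpyFromSymbol(host, result, sizeof(host));
    printf("atomicAdd: %g\n", host[0]);
    printf("CAS loop:  %g\n", host[1]);
    return 0;
}
```

If the flushing behavior quoted above applies on your device, the first value should stay 0 while the second should come out as a denormal close to 1e-40. The CAS loop is likely slower than the built-in atomicAdd under contention, so it probably only makes sense where the denormal range actually matters.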

CUDA - atomicAdd(float) does not add very small values

user2412789
