atomicAdd() for double on GPU

Question

I am doing a project on GPU, and I have to use atomicAdd() for double, because the cuda does not support it for double, so I use the code below, which is NVIDIA provide.

__device__ double atomicAdd(double* address, double val)
{
    unsigned long long int* address_as_ull =
                                          (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed, 
                        __double_as_longlong(val + 
                        __longlong_as_double(assumed)));
    } while (assumed != old);
    return __longlong_as_double(old);
}

Now I want to know why the implement require a loop, while (assumed!=old)

talonmies · Accepted Answer

Basically because the implementation requires a load, which can't be performed atomically. The compare-and-swap operation is an atomic version of

(*address == assumed) ? (assumed + val) : *address

There is no guarantee the the value at *address won't change between the cycle that the value is loaded from *address and the cycle that the atomicCAS call is used to store the updated value. If that happens, the value at *address won't be updated. Therefore the loop ensures that the two operations are repeated until there is no change of the value at *address between the read and the compare-and-swap operation, which implies that the update took place.

atomicAdd() for double on GPU

Tags:

atomic

cuda

taoyuanjl

1 Answers

talonmies

Recent Activity

Donate For Us

atomicAdd() for double on GPU

Tags:

atomic

cuda

taoyuanjl

1 Answers

talonmies

Related questions

Recent Activity

Donate For Us