CUDA __umul24 function, useful or not?

Question

Is it worth replacing all multiplications with the __umul24 function in a CUDA kernel? I have read differing and opposing opinions, and I still haven't managed to put together a benchmark to figure it out.

fabrizioM · Accepted Answer

Only on devices with an architecture prior to Fermi, that is, with compute capability below 2.0, where the integer arithmetic unit is 24-bit.

On CUDA devices with compute capability >= 2.0 the integer unit is 32-bit, so __umul24 will be slower instead of faster. The reason is that the 24-bit operation has to be emulated on the 32-bit hardware.

The question now is: is it worth the effort for the speed gain? Probably not.
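If you still want the 24-bit path on old hardware without penalizing newer devices, one option is to pick the multiply at compile time. The following is only a sketch: `mul_small`, `HD` and `scale_indices` are hypothetical names made up for illustration, and it assumes both operands are guaranteed to fit in 24 bits (for example thread or block indices), in which case both branches give the same result. The guards let the same file also build as plain C++ on the host.

```cpp
// Builds with nvcc as CUDA C++; the guards also let it build as plain C++.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper (not part of CUDA): multiply two unsigned values that
// the caller guarantees fit in 24 bits each.
HD inline unsigned int mul_small(unsigned int a, unsigned int b)
{
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
    // Device code for compute capability 1.x: use the 24-bit multiply.
    return __umul24(a, b);
#else
    // Compute capability 2.0+ and host code: plain 32-bit multiply.
    return a * b;
#endif
}

#ifdef __CUDACC__
// Hypothetical kernel showing the typical use: index arithmetic.
__global__ void scale_indices(unsigned int *out, unsigned int stride,
                              unsigned int n)
{
    unsigned int i = mul_small(blockIdx.x, blockDim.x) + threadIdx.x;
    if (i < n) {
        out[i] = mul_small(i, stride);
    }
}
#endif
```

With this, the choice is made per target architecture at compile time, so there is no runtime branch. Whether it actually pays off on a given 1.x device still needs to be measured.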

Tags:

cuda

multiplication

Marco A.
