My CUDA program uses only float, int, short, and char types in its computation. None of the input or output arrays have members of type double, and none of the kernels create any double values internally during computation.
This program has been compiled using CUDA SDK 5.5 in Release mode using NSight Eclipse. A typical compile line looks like this:
nvcc -O3 -gencode arch=compute_35,code=sm_35 -M -o "src/foo.d" "../src/foo.cu"
I am running this program on a GTX Titan on Linux. To my surprise, I noticed that this program runs 10% faster when I enable the full speed FP64 mode on Titan. This can be done by enabling CUDA Double Precision option in NVIDIA X Server Settings program.
While I am happy with this free speed bonus, I would like to understand why a float-only CUDA program could run faster in FP64 mode.
My guess is that when you enable full-speed FP64 mode on Titan, the dedicated FP64 units also start participating in the computation, and these FP64 units can execute FP32 work as well. However, enabling the FP64 blocks at full rate also lowers the clock speed, so the net gain is only about 10%.
Where does the 10% come from? When Titan runs in 1/24 FP64 mode, it clocks at 837 MHz; in 1/3 FP64 mode, it clocks at 725 MHz. So (1 + 1/3) / (1 + 1/24) * 725/837 = 1.109.
References: http://www.anandtech.com/show/6760/nvidias-geforce-gtx-titan-part-1/4
I found confirmation of my guess:
"What's more, the CUDA FP64 block has a very special execution rate: 1/1 FP32."
Reference http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2
That information is for GK104, while Titan has GK110, but they share the same Kepler architecture, so I think GK110's FP64 units have the same capability.