I'm interested in implementing an algorithm on the GPU using HLSL, but one of my main concerns is that I would like a variable level of precision. Are there techniques out there to emulate 64-bit precision and higher that could be implemented on the GPU?
Thanks!
GPUs are just beginning to support double precision in hardware, though it will continue to be much slower than single precision in the near future. A wide variety of techniques have been developed over the years to synthesize higher-accuracy floating point from a representation composed of multiple floats in whatever precision has fast hardware support, but the overhead is pretty substantial. IIRC, the crlibm manual has a pretty good discussion of some of these techniques, with error analysis and pseudocode (crlibm uses them to represent numbers as more than one double-precision value, but the same techniques can be used with singles).
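The building block for most of these multi-float schemes is an "error-free transformation" such as Knuth's TwoSum, which recovers the rounding error of an addition exactly. Here's a minimal sketch in C++ (the helper name two_sum is mine; the same sequence of operations translates to HLSL, provided the compiler doesn't reorder or contract the arithmetic):

```cpp
#include <cstdio>

// Knuth's TwoSum: given a and b, computes s = round(a + b) and the exact
// rounding error err, so that a + b == s + err holds exactly.
// Requires strict IEEE-754 evaluation (no -ffast-math or FMA contraction).
void two_sum(float a, float b, float &s, float &err) {
    s = a + b;
    float bv = s - a;                 // the portion of b absorbed into s
    err = (a - (s - bv)) + (b - bv);  // what rounding threw away
}

int main() {
    float s, err;
    two_sum(1.0f, 1e-8f, s, err);
    // s == 1.0f; err recovers the 1e-8 that single precision lost
    std::printf("s = %.9g, err = %.9g\n", s, err);
}
```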
Without knowing more about what you're trying to do, it's hard to give a better answer. For some algorithms, only one small part of the computation needs high accuracy; if you're in a case like that, it might be possible for you to get decent performance on the GPU, though the code won't necessarily be very pretty or easy to work with. If you need high precision pervasively throughout your algorithm, then the GPU probably isn't an attractive option for you at the moment.
Finally, why HLSL and not a compute-oriented language like CUDA or OpenCL?
Using two floats (i.e. single-precision values), you can achieve roughly 48 bits of precision, about double the 24-bit significand of a single float. This approaches the 53 bits of a double, but many of the operations you can implement for this "double-single" data type are slow and less precise than native doubles. For simple arithmetic operations, however, they are usually sufficient.
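To illustrate the representation: the value is stored as an unevaluated sum hi + lo, where lo holds the part of the number that hi's 24-bit significand cannot. Here is a sketch of double-single addition in C++, following the TwoSum-based scheme that DSFUN90 uses (the names ds and ds_add are mine, not from the package):

```cpp
// A "double-single" value: the number is hi + lo, with |lo| <= 0.5 ulp(hi).
struct ds { float hi, lo; };

// Double-single addition in the style of DSFUN90's dsadd.
// Requires strict IEEE single-precision evaluation (no fast-math).
ds ds_add(ds a, ds b) {
    // TwoSum of the high parts: s + e == a.hi + b.hi exactly
    float s = a.hi + b.hi;
    float v = s - a.hi;
    float e = (a.hi - (s - v)) + (b.hi - v);
    // Fold in the low-order parts, then renormalize into (hi, lo)
    e += a.lo + b.lo;
    ds r;
    r.hi = s + e;
    r.lo = e - (r.hi - s);
    return r;
}
```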
This paper talks a bit about the idea and describes how to implement the multiplication operation. For a more complete list of operations you can perform and how to implement them, check out the DSFUN90 package here. The package is written in Fortran 90, but it can be translated to anything that has single-precision numbers. Be aware, though, that you must license the library to use it for commercial purposes. I believe the Mersenne-Twister CUDA demo application also has implementations of the addition and multiplication operations.
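Multiplication needs one extra trick: Dekker's splitting, which cuts each 24-bit significand into two 12-bit halves so that the partial products are exact. A sketch continuing the ds type from above (again my own naming rather than DSFUN90's Fortran interface, and again assuming strict IEEE evaluation):

```cpp
// Dekker's split: break a into hi + lo, each half fitting in 12 bits.
// The constant 4097 = 2^12 + 1 matches the 24-bit float significand.
// (Note: 4097.0f * a can overflow for very large a.)
static void split(float a, float &hi, float &lo) {
    float t = 4097.0f * a;
    hi = t - (t - a);
    lo = a - hi;
}

// Double-single multiplication in the style of DSFUN90's dsmul.
ds ds_mul(ds a, ds b) {
    float ahi, alo, bhi, blo;
    split(a.hi, ahi, alo);
    split(b.hi, bhi, blo);
    float p = a.hi * b.hi;  // rounded product of the high parts
    // Reconstruct the exact rounding error of p from the partial products
    float e = ((ahi * bhi - p) + ahi * blo + alo * bhi) + alo * blo;
    e += a.hi * b.lo + a.lo * b.hi;  // cross terms with the low parts
    ds r;
    r.hi = p + e;
    r.lo = e - (r.hi - p);
    return r;
}
```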
This is a slightly off-topic answer, but if you want to see how your problem will be affected by switching some operations to single-precision arithmetic, consider using interval arithmetic to empirically measure the uncertainty bounds when you mix precisions in various ways. Boost has an interval arithmetic library that I once used to instrument an existing C++ scientific code; it was quite easy to use.
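As a minimal sketch of what that instrumentation looks like with Boost.Interval (the accumulation loop here is just a stand-in for your own computation):

```cpp
#include <boost/numeric/interval.hpp>
#include <iostream>

int main() {
    using I = boost::numeric::interval<float>;

    I x(0.1f);     // starts as a point interval [0.1f, 0.1f]
    I sum(0.0f);
    for (int i = 0; i < 10000; ++i)
        sum += x;  // each add rounds the bounds outward, so the width grows

    // The final width bounds the single-precision rounding error of the loop
    std::cout << "sum in [" << sum.lower() << ", " << sum.upper() << "]\n";
}
```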
But be warned: interval arithmetic is notoriously pessimistic, i.e. it sometimes exaggerates the bounds. Affine arithmetic is supposed to be better, but I never found a usable library for it.