I've been trying to find info on performance of using float vs double on graphics hardware. I've found plenty of info on float vs double on CPUs, but such info is more scarce for GPUs.
I code with OpenGL, so if there's any info specific to that API that you feel should be known, let's have at it.
I understand that if the program is moving a lot of data to/from the graphics hardware, then it would probably be better to use floats, since doubles would require twice the bandwidth. My question is more about how the graphics hardware does its processing. As I understand it, modern Intel CPUs convert float/double to an 80-bit real for calculations (SSE instructions excluded), so both types end up about equally fast. Do modern graphics cards do any such thing? Is float and double performance about equal now? Are there any strong reasons to use one over the other?
Double is more precise than float and is stored in 64 bits, double the number of bits a float uses. Because of that extra precision, double is preferred for storing large numbers. For example, to store the annual salary of the CEO of a company, double would be the more accurate choice.
Difference in precision (accuracy): float and double differ in how many decimal digits they can hold accurately. float can hold up to about 7 decimal digits, while double can hold up to about 15.
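To make those digit counts concrete, here is a minimal standalone C++ snippet (my own illustration, not from any particular answer) that stores the same constant in both types and prints roughly 20 significant digits; the float output goes wrong after about 7 digits, the double output after about 15-16:

```cpp
#include <cstdio>

int main() {
    // The same decimal literal stored in both types.
    float  f = 3.14159265358979323846f;
    double d = 3.14159265358979323846;

    // Print ~20 significant digits: float diverges from the true
    // value after ~7 digits, double after ~15-16.
    std::printf("float : %.20g\n", f);
    std::printf("double: %.20g\n", d);
    return 0;
}
```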
Floats are faster than doubles when you don't need double precision, you are memory-bandwidth bound, and your hardware carries no penalty for floats. They conserve memory bandwidth because they occupy half the space per number.
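As a rough back-of-the-envelope illustration of the bandwidth point (the vertex layout below is hypothetical: 3 position, 3 normal, and 2 texture-coordinate components), here is a short C++ sketch showing how the upload size of a mesh doubles when you switch from float to double:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical vertex layout: 3 position + 3 normal + 2 texcoord components.
struct VertexF { float  attr[8]; };   // 8 * 4 bytes = 32 bytes per vertex
struct VertexD { double attr[8]; };   // 8 * 8 bytes = 64 bytes per vertex

int main() {
    const std::size_t vertexCount = 1000000;  // a one-million-vertex mesh
    std::printf("float  vertex buffer: %zu MB\n",
                vertexCount * sizeof(VertexF) / (1024 * 1024));
    std::printf("double vertex buffer: %zu MB\n",
                vertexCount * sizeof(VertexD) / (1024 * 1024));
    return 0;
}
```

Twice the bytes per vertex means twice the data pushed over the bus and twice the memory traffic on the card for the same geometry.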
Both double and float can be used to represent floating-point numbers in Java as well. Double is preferred over float when a more precise and accurate result is required: its precision is about 15 to 16 decimal digits, while float's is only around 6 to 7.
In terms of speed, GPUs are optimized for floats. I'm much more familiar with Nvidia hardware, but in current-generation hardware there is 1 double-precision FPU for every 8 single-precision FPUs. Next-generation hardware is expected to move closer to a 1:2 ratio.
My recommendation would be to check whether your algorithm actually needs double precision. Many algorithms don't really need the extra bits. Run some tests to determine the average error you get by going to single precision and figure out whether it's significant. If not, just use single precision.
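Here is a minimal sketch of that kind of test, assuming the computation can also be run on the CPU in both precisions; it evaluates the same reduction in float and double and reports the relative error, with the double result used as the reference:

```cpp
#include <cmath>
#include <cstdio>

int main() {
    // Sum 1/k for k = 1..N in both precisions and compare the results.
    const int N = 1000000;
    float  sumF = 0.0f;
    double sumD = 0.0;
    for (int k = 1; k <= N; ++k) {
        sumF += 1.0f / static_cast<float>(k);
        sumD += 1.0  / static_cast<double>(k);
    }
    // Treat the double result as the reference value.
    const double relErr =
        std::fabs(static_cast<double>(sumF) - sumD) / std::fabs(sumD);
    std::printf("float : %.9g\ndouble: %.17g\nrelative error: %.3g\n",
                static_cast<double>(sumF), sumD, relErr);
    return 0;
}
```

If the error you see on your real data is well below what your application can tolerate, single precision is the safer default on GPU hardware.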
If your algorithm is purely for graphics, you probably don't need double precision. If you are doing general purpose computation, consider using OpenCL or CUDA.