I need to efficiently add or multiply some constants to a result of type double in a loop to prevent underflow. For example, if we have int, multiplying with a power of 2 will be fast as the compiler will use bit shift. Is there a form of constants for efficient double
addition and multiplication?
Edit: It seems that not many understand my question, apologies for my sloppiness . I will add some code.
If a
is a int, this (multiplying with a power of 2) will be more efficient
int a = 1;
for(...)
for(...)
a *= somefunction() * 1024;
than when 1024 is replaced with say 1023. not sure what is the best if we want to add to a int, but that is not of my interest. I am interested in the case where a
is a double. What are the forms of constants (e.g. power of 2) that we can efficiently add and multiply to a double? The constant is arbitrary, just need to be large enough to prevent underflow.
This is probably not restricted to C and C++ only, but I do not know of a more appropriate tag.
On most modern processors, simply multiplying by a power of two (e.g., x *= 0x1p10;
to multiply by 210 or x *= 0x1p-10;
to divide by 210) will be fast and error-free (unless the result is large enough to overflow or small enough to underflow).
There are some processors with “early outs” for some floating-point operations. That is, they complete the instruction more quickly when certain bits are zero or meet other criteria. However, floating-point addition, subtraction, and multiplication commonly execute in about four CPU cycles, so they are fairly fast even without early outs. Additionally, most modern processors execute several instructions at a time, so other work proceeds while a multiplication is occurring, and they are pipelined, so, commonly, one multiplication can be started (and one finish) in each CPU cycle. (Sometimes more.)
Multiplying by powers of two has no rounding error because the significand (fraction portion of the value) does not change, so the new significand is exactly representable. (Except, multiplying by a value less than 1, bits of the significand can be pushed lower than the limit of the floating-point type, causing underflow. For the common IEEE 754 double format, this does not occur until the value is less than 0x1p-1022.)
Do not use division for scaling (or for reversing the effects of prior scaling). Instead, multiply by the inverse. (To remove a previous scaling of 0x1p57, multiply by 0x1p-57.) This is because division instructions are slow on most modern processors. E.g., 30 cycles is not unusual.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With