As a canonical example, consider the problem of argument reduction for trigonometric functions, such as computing x mod 2π as a first step in computing sin(x). This kind of problem is difficult in that you can't just use fmod, because y (2π in the example) is not representable.
I came up with a simple solution that works for arbitrary values of y, not just 2π, and I'm curious how it compares in performance with typical argument-reduction algorithms.
The basic idea is to store a table containing the value of 2^n mod y for each n in the range log2(y) to the maximum possible floating-point exponent, then, using the linearity of modular arithmetic, sum the table entries corresponding to the bits that are set in x. It amounts to N branches and at most N additions, where N is the number of mantissa bits in your floating-point type. The result is not necessarily less than y, but it's bounded by N*y, and the procedure can be applied again to give a result bounded by log2(N)*y, or fmod can simply be used at that point with minimal error.
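To make the scheme concrete, here is a minimal Python sketch (the names build_table and reduce_once are mine). One caveat: for an irrational y such as 2π, the table entries would need to be precomputed offline in extended precision; this sketch uses math.fmod to fill the table, which is only exact when y itself is exactly representable.

```python
import math

def build_table(y, min_exp=-1074, max_exp=1024):
    """Precompute 2**n mod y for every possible bit position n of a double.

    For simplicity the table covers all exponents, including n below
    log2(y), where the entry is just 2**n itself. For irrational y these
    entries should be computed offline in extended precision; math.fmod
    is used here as a stand-in, exact only for representable y.
    """
    return {n: math.fmod(math.ldexp(1.0, n), y) for n in range(min_exp, max_exp)}

def reduce_once(x, y, table):
    """Sum table entries over the set bits of x; result is bounded by 53*y."""
    m, e = math.frexp(x)            # x = m * 2**e with 0.5 <= m < 1
    mant = int(math.ldexp(m, 53))   # 53-bit integer significand
    base = e - 53                   # exponent of the least significant bit
    s = 0.0
    for bit in range(53):
        if mant & (1 << bit):
            s += table[base + bit]  # add 2**(base + bit) mod y
    return s
```

Because the result is only bounded by N*y, you would finish with a second pass or a single fmod call, as described above.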
Can this be improved? And do the typical trigonometric argument reduction algorithms work for arbitrary y or only for 2π?
State-of-the-art implementations of the trigonometric functions in mathematical libraries work correctly across the entire input domain. They do so by representing some constant related to π, such as 2/π, to sufficient accuracy for the floating-point format used.
For example, a trigonometric reduction in IEEE double precision needs the constant represented to about 1150 bits, to be used in what is essentially a fixed-point computation. This approach was pioneered by the authors of the following paper:
M. Payne and R. Hanek. Radian reduction for trigonometric functions. SIGNUM Newsletter, 18:19–24, 1983
The original idea has since been modified and refined by other authors; both floating-point-based and integer-based variants are possible. The FDLIBM library provides a fully worked example here:
http://www.netlib.org/fdlibm/e_rem_pio2.c
The following paper by the author of FDLIBM describes the approach used in this code:
K.C. Ng. Argument Reduction for Huge Arguments: Good to the Last Bit.
http://www.validlab.com/arg.pdf
Note that it is not necessary to carry the intermediate computation to 1150 bits. Since the leading bits cancel in a reduction, the computation only needs to involve a much smaller window of bits inside the full constant. Due to the need for multi-precision arithmetic, this is still a fairly expensive operation.
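The fixed-point idea can be sketched as follows (all names here are mine, and for simplicity this version multiplies by the full high-precision constant rather than the smaller bit window that real implementations such as FDLIBM select; the constant 2/π is derived on the fly from Machin's formula using Python big integers):

```python
import math

PREC = 1280  # working precision in bits; about 1150 are needed for doubles

def _arccot(x, unity):
    # arctan(1/x) scaled by `unity`, summed as a Taylor series in integers
    total, power, n, sign = 0, unity // x, 1, 1
    while power:
        total += sign * (power // n)
        power //= x * x
        n += 2
        sign = -sign
    return total

# pi * 2**PREC via Machin's formula: pi/4 = 4*arctan(1/5) - arctan(1/239)
PI_SCALED = 4 * (4 * _arccot(5, 1 << PREC) - _arccot(239, 1 << PREC))
# (2/pi) * 2**PREC as a big integer
TWO_OVER_PI = (1 << (2 * PREC + 1)) // PI_SCALED

def rem_pio2(x):
    """Return (quadrant, r) with x = (4*k + quadrant)*(pi/2) + r, 0 <= r < pi/2."""
    m, e = math.frexp(x)
    mant = int(math.ldexp(m, 53))   # 53-bit integer significand
    e -= 53                         # exponent of the least significant bit
    prod = mant * TWO_OVER_PI       # x * (2/pi), scaled by 2**(PREC - e)
    shift = PREC - e
    q = prod >> shift               # integer part of x * (2/pi)
    frac = prod - (q << shift)      # fractional part, scaled by 2**shift
    r = (frac / (1 << shift)) * (math.pi / 2)
    return q & 3, r
```

The quadrant then selects which of ±sin, ±cos to apply to the reduced argument, and the scheme keeps working even for huge inputs such as 1e22.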
For trigonometric function argument reduction on more tightly restricted ranges, other, more economical schemes are possible, especially when the hardware supports FMA (fused multiply-add).
The techniques used for trigonometric argument reduction seem extensible to reductions by arbitrary high-precision constants.