Computing x mod y where y is not representable as floating point

As a canonical example, consider the problem of argument reduction for trigonometric functions, as in computing x mod 2π as a first step in computing sin(x). This kind of problem is difficult because you can't just use fmod: y (2π in the example) is not representable as a floating-point value, so fmod ends up reducing by a slightly wrong modulus.
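To see the issue concretely, here is a quick demonstration (a sketch that assumes the system libm's sin() reduces its argument correctly over the full range): reducing with fmod against the rounded double 2*M_PI drifts away from the true x mod 2π as x grows, and the two sin() values diverge.

    #include <math.h>
    #include <stdio.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    int main(void)
    {
        /* fmod reduces by the rounded double 2*M_PI, so the naive reduction is
         * off by roughly (x / 2pi) * |2pi - fl(2pi)|, which grows with x. */
        const double xs[] = { 10.0, 1.0e8, 1.0e15, 1.0e22 };
        for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; ++i) {
            double x     = xs[i];
            double naive = sin(fmod(x, 2.0 * M_PI));  /* reduce by rounded 2*pi */
            double ref   = sin(x);                    /* libm reduces correctly */
            printf("x = %-9.2e  sin(fmod(x, 2pi)) = % .17f  sin(x) = % .17f\n",
                   x, naive, ref);
        }
        return 0;
    }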

I came up with a simple solution that works for arbitrary values of y, not just 2π, and I'm curious how it compares (in performance) with typical argument-reduction algorithms.

The basic idea is to store a table containing the value of 2^n mod y for each n from log2(y) up to the maximum possible floating-point exponent, and then, using the linearity of modular arithmetic, sum the table entries corresponding to the bits that are set in x. It amounts to N branches and at most N additions, where N is the number of mantissa bits in your floating-point type. The result is not necessarily less than y, but it is bounded by N*y, and the procedure can be applied again to give a result bounded by log2(N)*y, or fmod can simply be used at that point with minimal error.
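Here is a rough sketch of what I mean (names are made up; for demonstration the table is filled with fmod against the rounded y, which only shows the mechanics, whereas in a real use the entries would be precomputed offline in extended precision against the exact y, e.g. 2π):

    #include <float.h>
    #include <math.h>
    #include <stdio.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    #define TABLE_MIN 3      /* ~ceil(log2(y)) for y = 2*pi; bits below this are < y */
    #define TABLE_MAX 1024   /* one past the largest power-of-two weight in a double */

    static double pow2_mod_y[TABLE_MAX - TABLE_MIN];

    /* Fill table[n] with 2^n mod y.  fmod of two doubles is exact, but this only
     * reduces modulo the rounded y; a real implementation would precompute the
     * entries against the exact y in extended precision. */
    static void build_table(double y)
    {
        for (int n = TABLE_MIN; n < TABLE_MAX; ++n)
            pow2_mod_y[n - TABLE_MIN] = fmod(ldexp(1.0, n), y);
    }

    /* Sum table entries over the set bits of x (x > 0 assumed for brevity).
     * Bits whose weight 2^e is already below y contribute themselves.  The sum
     * is congruent to x mod y but only bounded by about N*y, N = DBL_MANT_DIG,
     * so the caller reduces it once more (or applies fmod). */
    static double reduce(double x)
    {
        int e;
        double m = frexp(x, &e);     /* x = m * 2^e with 0.5 <= m < 1 */
        double sum = 0.0;

        for (int i = 0; i < DBL_MANT_DIG; ++i) {
            m *= 2.0;
            e -= 1;
            if (m >= 1.0) {          /* the bit of weight 2^e is set in x */
                m -= 1.0;
                sum += (e >= TABLE_MIN) ? pow2_mod_y[e - TABLE_MIN]
                                        : ldexp(1.0, e);
            }
        }
        return sum;
    }

    int main(void)
    {
        double y = 2.0 * M_PI;
        build_table(y);
        double x = 1.0e15;
        double s = reduce(x);
        printf("x mod y: table sum = %.17g, after fmod = %.17g\n", s, fmod(s, y));
        return 0;
    }

As written, the sketch is only as accurate as the rounded y used to build the table; the accuracy benefit appears once the entries are precomputed against the exact value.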

Can this be improved? And do the typical trigonometric argument reduction algorithms work for arbitrary y or only for 2π?

asked Jun 25 '11 by R.. GitHub STOP HELPING ICE
1 Answer

State of the art implementations of the trigonometric functions in mathematical libraries work correctly across the entire input domain. They do so by representing some constant related to π, such as 2/π, to sufficient accuracy for the floating-point format used.

For example, for a trig function reduction in IEEE double precision, one needs to represent the constant to about 1150 bits, to be used in what is essentially a fixed-point computation. This approach was pioneered by the authors of the following paper:

M. Payne and R. Hanek. Radian reduction for trigonometric functions. SIGNUM Newsletter, 18:19–24, 1983

The original idea has since been modified and refined by other authors; both floating-point-based and integer-based variants are possible. The FDLIBM library provides a fully worked example here:

http://www.netlib.org/fdlibm/e_rem_pio2.c

The following paper by the author of FDLIBM describes the approach used in this code:

K.C. Ng, Argument Reduction for Huge Arguments: Good to the Last Bit. http://www.validlab.com/arg.pdf

Note that it is not necessary to carry the intermediate computation to 1150 bits. Since the leading bits cancel in the reduction, the computation only needs to involve a much smaller window of bits inside the full constant. Due to the need for multi-precision arithmetic, this is still a fairly expensive operation.

For trigonometric function argument reduction on more tightly restricted ranges other, more economical, schemes are possible, especially when the hardware supports FMA (fused multiply-add).
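For concreteness, one common shape such a restricted-range scheme can take (an illustrative sketch, not code from any particular library) is to carry y as an unevaluated double-double y_hi + y_lo and subtract the nearest multiple of each part with fma(), so that every step incurs only a single rounding:

    #include <math.h>
    #include <stdio.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    /* Reduce x by the nearest integer multiple of y, where y is carried as an
     * unevaluated double-double y_hi + y_lo.  Each fma() rounds only once, so
     * the result stays accurate as long as n * |y - (y_hi + y_lo)| is
     * negligible, i.e. only over a restricted range of x. */
    static double reduce_by(double x, double y_hi, double y_lo, double *n_out)
    {
        double n = nearbyint(x / (y_hi + y_lo));
        double r = fma(-n, y_hi, x);   /* x - n*y_hi, one rounding */
        r        = fma(-n, y_lo, r);   /* fold in the low part of y */
        if (n_out)
            *n_out = n;
        return r;
    }

    int main(void)
    {
        /* y = 2*pi as a double-double: hi is the rounded 2*pi, lo approximates
         * the part the rounding dropped (the next ~53 bits of 2*pi). */
        const double twopi_hi = 2.0 * M_PI;
        const double twopi_lo = 2.4492935982947064e-16;

        double n, r = reduce_by(1.0e6, twopi_hi, twopi_lo, &n);
        printf("1.0e6 mod 2*pi ~= %.17g  (removed n = %.0f multiples)\n", r, n);
        return 0;
    }

This stays accurate only while n times the neglected tail of y remains below the required precision, hence the restricted range. Production implementations typically use a Cody-Waite style split chosen so that n*y_hi is exact, and fall back to the Payne-Hanek path described above once n grows too large.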

The techniques used for trigonometric argument reduction seem extensible to reductions by arbitrary high-precision constants.

answered Sep 20 '22 by njuffa