For the simple and efficient implementation of fast math functions with reasonable accuracy, polynomial minimax approximations are often the method of choice. Minimax approximations are typically generated with a variant of the Remez algorithm. Various widely available tools such as Maple and Mathematica have built-in functionality for this. The generated coefficients are typically computed using high-precision arithmetic. It is well-known that simply rounding those coefficients to machine precision leads to suboptimal accuracy in the resulting implementation.
Instead, one searches for closely related sets of coefficients that are exactly representable as machine numbers to generate a machine-optimized approximation. Two relevant papers are:
Nicolas Brisebarre, Jean-Michel Muller, and Arnaud Tisserand, "Computing Machine-Efficient Polynomial Approximations", ACM Transactions on Mathematical Software, Vol. 32, No. 2, June 2006, pp. 236–256.
Nicolas Brisebarre and Sylvain Chevillard, "Efficient polynomial L∞-approximations", 18th IEEE Symposium on Computer Arithmetic (ARITH-18), Montpellier (France), June 2007, pp. 169-176.
An implementation of the LLL-algorithm from the latter paper is available as the fpminimax()
command of the Sollya tool. It is my understanding that all algorithms proposed for the generation of machine-optimized approximations are based on heuristics, and that it is therefore generally unknown what accuracy can be achieved by an optimal approximation. It is not clear to me whether the availability of FMA (fused multiply-add) for the evaluation of the approximation has an influence on the answer to that question. It seems to me naively that it should.
I am currently looking at a simple polynomial approximation for arctangent on [-1,1] that is evaluated in IEEE-754 single-precision arithmetic, using the Horner scheme and FMA. See function atan_poly()
in the C99 code below. For lack of access to a Linux machine at the moment, I did not use Sollya to generate these coefficients, but used my own heuristic that could be loosely described as a mixture of steepest decent and simulated annealing (to avoid getting stuck on local minima). The maximum error of my machine-optimized polynomial is very close to 1 ulp, but ideally I would like the maximum ulp error to be below 1 ulp.
I am aware that I could change my computation to increase the accuracy, for example by using a leading coefficient represented to more than single-precision precision, but I would like to keep the code exactly as is (that is, as simple as possible) adjusting only the coefficients to deliver the most accurate result possible.
A "proven" optimal set of coefficients would be ideal, pointers to relevant literature are welcome. I did a literature search but could not find any paper that advances the state of the art meaningfully beyond Sollya's fpminimax()
, and none that examine the role of FMA (if any) in this issue.
// max ulp err = 1.03143
float atan_poly (float a)
{
float r, s;
s = a * a;
r = 0x1.7ed1ccp-9f;
r = fmaf (r, s, -0x1.0c2c08p-6f);
r = fmaf (r, s, 0x1.61fdd0p-5f);
r = fmaf (r, s, -0x1.3556b2p-4f);
r = fmaf (r, s, 0x1.b4e128p-4f);
r = fmaf (r, s, -0x1.230ad2p-3f);
r = fmaf (r, s, 0x1.9978ecp-3f);
r = fmaf (r, s, -0x1.5554dcp-2f);
r = r * s;
r = fmaf (r, a, a);
return r;
}
// max ulp err = 1.52637
float my_atanf (float a)
{
float r, t;
t = fabsf (a);
r = t;
if (t > 1.0f) {
r = 1.0f / r;
}
r = atan_poly (r);
if (t > 1.0f) {
r = fmaf (0x1.ddcb02p-1f, 0x1.aee9d6p+0f, -r); // pi/2 - r
}
r = copysignf (r, a);
return r;
}
The following function is a faithfully-rounded implementation of arctan
on [0, 1]
:
float atan_poly (float a) {
float s = a * a, u = fmaf(a, -a, 0x1.fde90cp-1f);
float r1 = 0x1.74dfb6p-9f;
float r2 = fmaf (r1, u, 0x1.3a1c7cp-8f);
float r3 = fmaf (r2, s, -0x1.7f24b6p-7f);
float r4 = fmaf (r3, u, -0x1.eb3900p-7f);
float r5 = fmaf (r4, s, 0x1.1ab95ap-5f);
float r6 = fmaf (r5, u, 0x1.80e87cp-5f);
float r7 = fmaf (r6, s, -0x1.e71aa4p-4f);
float r8 = fmaf (r7, u, -0x1.b81b44p-3f);
float r9 = r8 * s;
float r10 = fmaf (r9, a, a);
return r10;
}
The following test harness will abort if the function atan_poly
fails to be faithfully-rounded on [1e-16, 1]
and print "success" otherwise:
int checkit(float f) {
double d = atan(f);
float d1 = d, d2 = d;
if (d1 < d) d2 = nextafterf(d1, 1.0/0.0);
else d1 = nextafterf(d1, -1.0/0.0);
float p = atan_poly(f);
if (p != d1 && p != d2) return 0;
return 1;
}
int main() {
for (float f = 1; f > 1e-16; f = nextafterf(f, -1.0/0.0)) {
if (!checkit(f)) abort();
}
printf("success\n");
exit(0);
}
The problem with using s
in every multiplication is that the polynomial's coefficients do not decay rapidly. Inputs close to 1 result in lots and lots of cancellation of nearly equal numbers, meaning you're trying to find a set of coefficients so that the accumulated roundoff at the end of the computation closely approximates the residual of arctan
.
The constant 0x1.fde90cp-1f
is a number close to 1 for which (arctan(sqrt(x)) - x) / x^3
is very close to the nearest float. That is, it's a constant that goes into the computation of u
so that the cubic coefficient is almost completely determined. (For this program, the cubic coefficient must be either -0x1.b81b44p-3f
or -0x1.b81b42p-3f
.)
Alternating multiplications by s
and u
has the effect of reducing the effect of roundoff error in ri
upon r{i+2}
by a factor of at most 1/4, since s*u < 1/4
whatever a
is. This gives considerable leeway in choosing the coefficients of fifth order and beyond.
I found the coefficients with the aid of two programs:
a
, one can compute the range of r8
that lead to a faithfully-rounded result. To get linear inequalities, I pretended r8
would be computed as a polynomial in the float
s s
and u
in real-number arithmetic; the linear inequalities constrained this real-number r8
to lie in some interval. I used the Parma Polyhedra Library to handle these constraint systems.float
s from 1
to 1e-8
in descending order and checking that atan_poly
produces a faithful rounding of atan((double)x)
. If some x
failed, it printed out that x
and why it failed.To get coefficients, I hacked this first program to fix c3
, work out bounds on r7
for each test point, then get bounds on the higher-order coefficients. Then I hacked it to fix c3
and c5
and get bounds on the higher-order coefficients. I did this until I had all but the three highest-order coefficients, c13
, c15
, and c17
.
I grew the set of test points in the second program until it either stopped printing anything out or printed out "success". I needed surprisingly few test points to reject almost all wrong polynomials---I count 85 test points in the program.
Here I show some of my work selecting the coefficients. In order to get a faithfully-rounded arctan
for my initial set of test points assuming r1
through r8
are evaluated in real arithmetic (and rounded somehow unpleasantly but in a way I can't remember) but r9
and r10
are evaluated in float
arithmetic, I need:
-0x1.b81b456625f15p-3 <= c3 <= -0x1.b81b416e22329p-3
-0x1.e71d48d9c2ca4p-4 <= c5 <= -0x1.e71783472f5d1p-4
0x1.80e063cb210f9p-5 <= c7 <= 0x1.80ed6efa0a369p-5
0x1.1a3925ea0c5a9p-5 <= c9 <= 0x1.1b3783f148ed8p-5
-0x1.ec6032f293143p-7 <= c11 <= -0x1.e928025d508p-7
-0x1.8c06e851e2255p-7 <= c13 <= -0x1.732b2d4677028p-7
0x1.2aff33d629371p-8 <= c15 <= 0x1.41e9bc01ae472p-8
0x1.1e22f3192fd1dp-9 <= c17 <= 0x1.d851520a087c2p-9
Taking c3 = -0x1.b81b44p-3, assuming r8
is also evaluated in float
arithmetic:
-0x1.e71df05b5ad56p-4 <= c5 <= -0x1.e7175823ce2a4p-4
0x1.80df529dd8b18p-5 <= c7 <= 0x1.80f00e8da7f58p-5
0x1.1a283503e1a97p-5 <= c9 <= 0x1.1b5ca5beeeefep-5
-0x1.ed2c7cd87f889p-7 <= c11 <= -0x1.e8c17789776cdp-7
-0x1.90759e6defc62p-7 <= c13 <= -0x1.7045e66924732p-7
0x1.27eb51edf324p-8 <= c15 <= 0x1.47cda0bb1f365p-8
0x1.f6c6b51c50b54p-10 <= c17 <= 0x1.003a00ace9a79p-8
Taking c5 = -0x1.e71aa4p-4, assuming r7
is done in float
arithmetic:
0x1.80e3dcc972cb3p-5 <= c7 <= 0x1.80ed1cf56977fp-5
0x1.1aa005ff6a6f4p-5 <= c9 <= 0x1.1afce9904742p-5
-0x1.ec7cf2464a893p-7 <= c11 <= -0x1.e9d6f7039db61p-7
-0x1.8a2304daefa26p-7 <= c13 <= -0x1.7a2456ddec8b2p-7
0x1.2e7b48f595544p-8 <= c15 <= 0x1.44437896b7049p-8
0x1.396f76c06de2ep-9 <= c17 <= 0x1.e3bedf4ed606dp-9
Taking c7 = 0x1.80e87cp-5, assuming r6
is done in float
arithmetic:
0x1.1aa86d25bb64fp-5 <= c9 <= 0x1.1aca48cd5caabp-5
-0x1.eb6311f6c29dcp-7 <= c11 <= -0x1.eaedb032dfc0cp-7
-0x1.81438f115cbbp-7 <= c13 <= -0x1.7c9a106629f06p-7
0x1.36d433f81a012p-8 <= c15 <= 0x1.3babb57bb55bap-8
0x1.5cb14e1d4247dp-9 <= c17 <= 0x1.84f1151303aedp-9
Taking c9 = 0x1.1ab95ap-5, assuming r5
is done in float
arithmetic:
-0x1.eb51a3b03781dp-7 <= c11 <= -0x1.eb21431536e0dp-7
-0x1.7fcd84700f7cfp-7 <= c13 <= -0x1.7ee38ee4beb65p-7
0x1.390fa00abaaabp-8 <= c15 <= 0x1.3b100a7f5d3cep-8
0x1.6ff147e1fdeb4p-9 <= c17 <= 0x1.7ebfed3ab5f9bp-9
I picked a point close to the middle of the range for c11
and randomly chose c13
, c15
, and c17
.
EDIT: I've now automated this procedure. The following function is also a faithfully-rounded implementation of arctan
on [0, 1]
:
float c5 = 0x1.997a72p-3;
float c7 = -0x1.23176cp-3;
float c9 = 0x1.b523c8p-4;
float c11 = -0x1.358ff8p-4;
float c13 = 0x1.61c5c2p-5;
float c15 = -0x1.0b16e2p-6;
float c17 = 0x1.7b422p-9;
float juffa_poly (float a) {
float s = a * a;
float r1 = c17;
float r2 = fmaf (r1, s, c15);
float r3 = fmaf (r2, s, c13);
float r4 = fmaf (r3, s, c11);
float r5 = fmaf (r4, s, c9);
float r6 = fmaf (r5, s, c7);
float r7 = fmaf (r6, s, c5);
float r8 = fmaf (r7, s, -0x1.5554dap-2f);
float r9 = r8 * s;
float r10 = fmaf (r9, a, a);
return r10;
}
I find it surprising that this code even exists. For coefficients near these, you can get a bound on the distance between r10
and the value of the polynomial evaluated in real arithmetic on the order of a few ulps thanks to the slow convergence of this polynomial when s
is near 1
. I had expected roundoff error to behave in a way that was fundamentally "untamable" simply by means of tweaking coefficients.
I pondered the various ideas I received in comments and also ran a few experiments based on that feedback. In the end I decided that a refined heuristic search was the best way forward. I have now managed to reduce the maximum error for atanf_poly()
to 1.01036 ulps, with just three arguments exceeding my stated goal of a 1 ulp error bound:
ulp = -1.00829 @ |a| = 9.80738342e-001 0x1.f62356p-1 (3f7b11ab)
ulp = -1.01036 @ |a| = 9.87551928e-001 0x1.f9a068p-1 (3f7cd034)
ulp = 1.00050 @ |a| = 9.99375939e-001 0x1.ffae34p-1 (3f7fd71a)
Based on the manner of generating the improved approximation there is no guarantee that this is a best approximation; no scientific breakthrough here. As the ulp error of the current solution is not yet perfectly balanced, and since continuing the search continues to deliver better approximations (albeit at exponentially increasing time intervals) my guess is that a 1 ulp error bound is achievable, but at the same time we seem to be very close to the best machine-optimized approximation already.
The better quality of the new approximation is the result of a refined search process. I observed that all of the largest ulp errors in the polynomial occur close to unity, say in [0.75,1.0] to be conservative. This allows for a fast scan for interesting coefficient sets whose maximum error is smaller than some bound, say 1.08 ulps. I can then test in detail and exhaustively all coefficient sets within a heuristically chosen hyper-cone anchored at that point. This second step searches for minimum ulp error as the primary goal, and maximum percentage of correctly rounded results as a secondary objective. By using this two-step process across all four cores of my CPU I was able to significantly speed up the search process: I have been able to check about 221 coefficient sets so far.
Based on the range of each coefficient across all "close" solutions I now estimate that the total useful search space for this approximation problem is >= 224 coefficient sets rather than the more optimistic number of 220 I threw out before. This seems like a feasible problem to solve for someone who is either very patient or has lots of computational horse-power at their disposal.
My updated code is as follows:
// max ulp err = 1.01036
float atanf_poly (float a)
{
float r, s;
s = a * a;
r = 0x1.7ed22cp-9f;
r = fmaf (r, s, -0x1.0c2c2ep-6f);
r = fmaf (r, s, 0x1.61fdf6p-5f);
r = fmaf (r, s, -0x1.3556b4p-4f);
r = fmaf (r, s, 0x1.b4e12ep-4f);
r = fmaf (r, s, -0x1.230ae0p-3f);
r = fmaf (r, s, 0x1.9978eep-3f);
r = fmaf (r, s, -0x1.5554dap-2f);
r = r * s;
r = fmaf (r, a, a);
return r;
}
// max ulp err = 1.51871
float my_atanf (float a)
{
float r, t;
t = fabsf (a);
r = t;
if (t > 1.0f) {
r = 1.0f / r;
}
r = atanf_poly (r);
if (t > 1.0f) {
r = fmaf (0x1.ddcb02p-1f, 0x1.aee9d6p+0f, -r); // pi/2 - r
}
r = copysignf (r, a);
return r;
}
Update (after revisiting the issue two-and-a-half years later)
Using T. Myklebust's draft publication as a starting point, I found the arctangent approximation on [-1,1] that has the smallest error to have a maximum error of 0.94528 ulp.
/* Based on: Tor Myklebust, "Computing accurate Horner form approximations
to special functions in finite precision arithmetic", arXiv:1508.03211,
August 2015. maximum ulp err = 0.94528
*/
float atanf_poly (float a)
{
float r, s;
s = a * a;
r = 0x1.6d2086p-9f; // 2.78569828e-3
r = fmaf (r, s, -0x1.03f2ecp-6f); // -1.58660226e-2
r = fmaf (r, s, 0x1.5beebap-5f); // 4.24722321e-2
r = fmaf (r, s, -0x1.33194ep-4f); // -7.49753043e-2
r = fmaf (r, s, 0x1.b403a8p-4f); // 1.06448799e-1
r = fmaf (r, s, -0x1.22f5c2p-3f); // -1.42070308e-1
r = fmaf (r, s, 0x1.997748p-3f); // 1.99934542e-1
r = fmaf (r, s, -0x1.5554d8p-2f); // -3.33331466e-1
r = r * s;
r = fmaf (r, a, a);
return r;
}
This is not an answer to the question, but is too long to fit in a comment:
your question is about the optimal choice of coefficients C3, C5, …, C17 in a polynomial approximation to arctangent where you have pinned C1 to 1 and C2, C4, …, C16 to 0.
The title of your question says you are looking for approximations on [-1, 1], and a good reason to pin the even coefficients to 0 is that it is sufficient and necessary for the approximation to be exactly an odd function. The code in your question “contradicts” the title by applying the polynomial approximation only on [0, 1].
If you use the Remez algorithm to look for coefficients C2, C3, …, C8 to a polynomial approximation of arctangent on [0, 1] instead, you may end up with something like the values below:
#include <stdio.h>
#include <math.h>
float atan_poly (float a)
{
float r, s;
s = a;
// s = a * a;
r = -3.3507930064626076153585890630056286726807491543578e-2;
r = fmaf (r, s, 1.3859776280052980081098065189344699108643282883702e-1);
r = fmaf (r, s, -1.8186361916440430105127602496688553126414578766147e-1);
r = fmaf (r, s, -1.4583047494913656326643327729704639191810926020847e-2);
r = fmaf (r, s, 2.1335202878219865228365738728594741358740655881373e-1);
r = fmaf (r, s, -3.6801711826027841250774413728610805847547996647342e-3);
r = fmaf (r, s, -3.3289852243978319173749528028057608377028846413080e-1);
r = fmaf (r, s, -1.8631479933914856903459844359251948006605218562283e-5);
r = fmaf (r, s, 1.2917291732886065585264586294461539492689296797761e-7);
r = fmaf (r, a, a);
return r;
}
int main() {
for (float x = 0.0f; x < 1.0f; x+=0.1f)
printf("x: %f\n%a\n%a\n\n", x, atan_poly(x), atan(x));
}
This has roughly the same complexity as the code in your question—the number of multiplications is similar. Looking at this polynomial, there is no reason in particular to want to pin any coefficient to 0. If we wanted to approximate an odd function over [-1, 1] without pinning the even coefficients, they would automatically come up very small and subject to absorption, and then we would want to pin them to 0, but for this approximation over [0, 1], they don't, so we don't have to pin them.
It could have been better or worse than the odd polynomial in your question. It turns out that it is worse (see below). This quick-and-dirty application of LolRemez 0.2 (code included at the bottom of the question) seems to be, however, good enough to raise the question of the choice of coefficients. I would in particular be curious what happens if you subject the coefficients in this answer to the same “mixture of steepest decent and simulated annealing” optimization step that you applied to get the coefficients in your question.
So, to summarize this remark-posted-as-an-answer, are you sure that you are looking for optimal coefficients C3, C5, …, C17? It seems to me that you are looking for the best sequence of single-precision floating-point operations that produce a faithful approximation to arctangent, and that this approximation does not have to be the Horner form of a degree 17 odd polynomial.
x: 0.000000 0x0p+0 0x0p+0 x: 0.100000 0x1.983e2cp-4 0x1.983e28938f9ecp-4 x: 0.200000 0x1.94442p-3 0x1.94441ff1e8882p-3 x: 0.300000 0x1.2a73a6p-2 0x1.2a73a71dcec16p-2 x: 0.400000 0x1.85a37ap-2 0x1.85a3770ebe7aep-2 x: 0.500000 0x1.dac67p-2 0x1.dac670561bb5p-2 x: 0.600000 0x1.14b1dcp-1 0x1.14b1ddf627649p-1 x: 0.700000 0x1.38b116p-1 0x1.38b113eaa384ep-1 x: 0.800000 0x1.5977a8p-1 0x1.5977a686e0ffbp-1 x: 0.900000 0x1.773388p-1 0x1.77338c44f8faep-1
This is the code that I linked to LolRemez 0.2 in order to optimize the relative accuracy of a degree-9 polynomial approximation of arctangent on [0, 1]:
#include "lol/math/real.h"
#include "lol/math/remez.h"
using lol::real;
using lol::RemezSolver;
real f(real const &y)
{
return (atan(y) - y) / y;
}
real g(real const &y)
{
return re (atan(y) / y);
}
int main(int argc, char **argv)
{
RemezSolver<8, real> solver;
solver.Run("1e-1000", 1.0, f, g, 50);
return 0;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With