Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

why floor, ceil implementation return x + x when x is NaN or inf?

I'm reading IEEE-754 math functions' implementation in glibc. Here is floor implementation.

float
__floorf(float x)
{
    int32_t i0,j0;
    uint32_t i;
    GET_FLOAT_WORD(i0,x);
    j0 = ((i0>>23)&0xff)-0x7f;
    if(j0<23) {
        if(j0<0) {
        /* return 0*sign(x) if |x|<1 */
        if(i0>=0) {i0=0;}
        else if((i0&0x7fffffff)!=0)
          { i0=0xbf800000;}
        } else {
        i = (0x007fffff)>>j0;
        if((i0&i)==0) return x; /* x is integral */
        if(i0<0) i0 += (0x00800000)>>j0;
        i0 &= (~i);
        }
    } else {
        if(__builtin_expect(j0==0x80, 0)) return x+x; /* inf or NaN */
        else return x;      /* x is integral */
    }
    SET_FLOAT_WORD(x,i0);
    return x;
}

Interesting part is if(__builtin_expect(j0==0x80, 0)) return x+x; /* inf or NaN */. Why does it return x+x when x is inf or NaN? Why not just return x?

EDIT

I got my code from https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/sysdeps/ieee754/flt-32/s_floorf.c and assumed it is fork from glibc.

like image 482
csehydrogen Avatar asked Apr 04 '19 05:04

csehydrogen


1 Answers

The purpose is to raise exceptions. When the input to floor is a signaling NaN, the routine should raise the floating-point invalid operation exception.1 Rather than calling some routine that would do this by manipulating bits in a floating-point status register, it is easier to simply evaluate x+x, as adding a signaling NaN to itself (or anything) will raise the invalid operation exception.

This is quite common in implementations of math library routines. For another example, consider sin(x). For very small values of x, sin(x) is so near x that x is the closest value representable in the floating-point format, so the returned value should be x. But the exact mathematical sin x is not exactly x (if x is not zero), so the inexact exception should be raised. To do this, a routine may return, for example, x + x*x. When x is very small (but not zero), this will evaluate to the same as x but it will raise the invalid exception.

Note an added benefit in this case: When x is zero, x + x*x does not raise the inexact exception. Thus, the expression serves for both zero and very small non-zero cases. So it substitutes not only for manually raising an exception but also for branching based on whether x is zero or not. This is not uncommon in these expressions; they are an efficient way of implementing the function.

Footnote

1 Floating-point exceptions are not C++ exceptions. How they are handled depends on settings for the floating-point environment. Most commonly, they simply raise flags that the program can later check. But they can also cause traps that change program execution, like C++ exceptions.

like image 64
Eric Postpischil Avatar answered Nov 17 '22 19:11

Eric Postpischil