Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

rint not present in Visual Studio 2010 math.h and equivalent of CUDA rint

I'm porting a CUDA code to C++ and using Visual Studio 2010. The CUDA code uses the rint function, which does not seem to be present in the Visual Studio 2010 math.h, so it seems that I need to implement it by myself.

According to this link, the CUDA rint function

rounds x to the nearest integer value in floating-point format, with halfway cases rounded towards zero.

I think I could use the casting to int which discards the fractional part, effectively rounding towards zero, so I ended-up with the following function

inline double rint(double x)
{
    int temp; temp = (x >= 0. ? (int)(x + 0.5) : (int)(x - 0.5));
    return (double)temp;
}

which has two different castings, one to int and one to double.

I have three questions:

  1. Is the above function fully equivalent to CUDA rint for "small" numbers? Will it fail for "large" numbers that cannot be represented as an int?
  2. Is there any more computationlly efficient way (rather than using two castings) of defining rint?

Thank you very much in advance.

like image 643
Vitality Avatar asked Dec 20 '22 09:12

Vitality


1 Answers

The cited description of rint() in the CUDA documentation is incorrect. Roundings to integer with floating-point result map the IEEE-754 (2008) specified rounding modes as follows:

trunc()   // round towards zero
floor()   // round down (towards negative infinity)
ceil()    // round up (towards positive infinity)
rint()    // round to nearest or even (i.e. ties are rounded to even)
round()   // round to nearest, ties away from zero

Generally, these functions work as described in the C99 standard. For rint(), the standard specifies that the function rounds according to the current rounding mode (which defaults to round to nearest or even). Since CUDA does not support dynamic rounding modes, all functions that are defined to use the current rounding mode use the rounding mode "round to nearest or even". Here are some examples showing the difference between round() and rint():

argument  rint()  round()
1.5       2.0     2.0
2.5       2.0     3.0
3.5       4.0     4.0
4.5       4.0     5.0

round() can be emulated fairly easily along the lines of the code you posted, I am not aware of a simple emulation for rint(). Please note that you would not want to use an intermediate cast to integer, as 'int' supports a narrower numeric range than integers that are exactly representable by a 'double'. Instead use trunc(), ceil(), floor() as appropriate.

Since rint() is part of both the current C and C++ standards, I am a bit surprised that MSVC does not include this function; I would suggest checking MSDN to see whether a substitute is offered. If your platforms are SSE4 capable, you could use the SSE intrinsics _mm_round_sd(), _mm_round_pd() defined in smmintrin.h, with the rounding mode set to _MM_FROUND_TO_NEAREST_INT, to implement the functionality of CUDA's rint().

While (in my experience), the SSE intrinsics are portable across Windows, Linux, and Mac OS X, you may want to avoid hardware specific code. In this case, you could try the following code (lightly tested):

double my_rint(double a)
{
    const double two_to_52 = 4.5035996273704960e+15;
    double fa = fabs(a);
    double r = two_to_52 + fa;
    if (fa >= two_to_52) {
        r = a;
    } else {
        r = r - two_to_52;
        r = _copysign(r, a);
    }
    return r;
}

Note that MSVC 2010 seems to lack the standard copysign() function as well, so I had to substitute _copysign(). The above code assumes that the current rounding mode is round-to-nearest-even (which it is by default). By adding 2**52 it makes sure that rounding occurs at the integer unit bit. Note that this also assumes that pure double-precision computation is performed. On platforms that use some higher precision for intermediate results one might need to declare 'fa' and 'r' as volatile.

like image 175
njuffa Avatar answered Dec 24 '22 02:12

njuffa