Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to compare long doubles with qsort and with regard to NaN?

Tags:

c

qsort

How to compare long doubles with qsort() and with regard to not-a-number?

When sorting an array that might contain not-a-numbers, I would like to put all the those NAN to one end of the sorted array.


qsort() imposes some restriction on the compare function.

The function shall return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second.
C11dr §7.22.5.2 3

When the same objects ... are passed more than once to the comparison function, the results shall be consistent with one another. That is, for qsort they shall define a total ordering on the array, ... the same object shall always compare the same way with the key.
§7.22.5 4

a > b is false when a <= b or if a is not-a-number or if b is not-a-number. So a > b is not the same as !(a <= b) as they have opposite results if one of them is NaN.

If the compare function uses return (a > b) - (a < b);, code would return 0 if one or both a or b are NaN. The array would not sort as desired and it loses the total ordering requirement.

The long double aspect of this sort is important when using the classify functions like int isnan(real-floating x); or int isfinite(real-floating x);. I know isfinite( finite_long_double_more_than_DBL_MAX) might return false. So I have concerns about what isnan(some_long_double) might do something unexpected.


I tried the below. It apparently sorts as desired.

Sub-question: Is compare() below sufficient to sort as desired? Any recommended simplifications? If not - how to fix? (For this task, it is OK for values like 0.0L and -0.0L to sort in any way)

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <float.h>

int compare(const void *a, const void *b) {
  const long double *fa = (const long double *) a;
  const long double *fb = (const long double *) b;
  if (*fa > *fb) return 1;
  if (*fa < *fb) return -1;

  if (*fa == *fb) {
    //return -memcmp(fa, fb, sizeof *fa); if -0.0, 0.0 order important.
    return 0;
  }
  // At least one of *fa or *fb is NaN
  // is *fa a non-NaN?
  if (!isnan(*fa)) return -1;
  if (!isnan(*fb)) return 1;

  // both NaN
  return 0;
  // return -memcmp(fa, fb, tbd size); if NaN order important.
}

int main(void) {
  long double x[] = { 0.0L / 0.0, 0.0L / 0.0, 0.0, 1.0L / 0.0, -0.0, LDBL_MIN,
      LDBL_MAX, 42.0, -1.0L / 0.0, 867-5309, -0.0 };
  x[0] = -x[0];
  printf("unsorted: ");
  size_t n = sizeof x / sizeof x[0];
  for (size_t i = 0; i < n; i++) {
    printf("%.3Le,", x[i]);
  }
  printf("\nsorted: ");
  qsort(x, n, sizeof x[0], compare);
  for (size_t i = 0; i < n; i++) {
    printf("%.3Le,", x[i]);
  }
  puts("");
}

Output

unsorted: nan,-nan,0.000e+00,inf,-0.000e+00,3.362e-4932,1.190e+4932,4.200e+01,-inf,-4.442e+03,-0.000e+00,
sorted: -inf,-4.442e+03,-0.000e+00,0.000e+00,-0.000e+00,3.362e-4932,4.200e+01,1.190e+4932,inf,nan,-nan,

If I knew the compare function was correct, I'd post on Code Review for improvement ideas. Yet I am not confident enough that code works correctly with those pesky NaNs.

like image 685
chux - Reinstate Monica Avatar asked Jan 02 '18 23:01

chux - Reinstate Monica


2 Answers

This is just a simple reordering of your tests, but it makes the status of NaN more clear if you will.

int compare(const void *a, const void *b)
{
    const long double fa = *(const long double *) a;
    const long double fb = *(const long double *) b;

    if (isnan(fa))
    {
        if (isnan(fb))
        {
            return 0;
        }
        return 1;
    }
    if (isnan(fb))
    {
        return -1;
    }
    if (fa > fb) return 1;
    if (fa < fb) return -1;

    /* no more comparisons needed */
    return 0;
}

As the tests for NaN are at the top and no NaNs should pass through, the bottom three lines can safely be replaced with your

return (a > b) - (a < b);

Apart from the discussion of the different types of NaN (a bit sounding like how many angels can dance on a CPU core), this ought to be stable enough for your purposes, and I can't see any possible issues with this code.

With Clang, neither -ffast-math nor -fdenormal-fp-math=[ieee|preserve-sign|positive-zero] yields other results. Nor did gcc with -ffast-math, -funsafe-math-optimizations, and even -ffinite-math-only (the latter most likely because there are no operations other than a straight compare to NaN).

Just to be complete, I tested with both std::numeric_limits<double>::signaling_NaN(); and std::numeric_limits<double>::quiet_NaN(); (from C++ <limits.h>) as well – again, no difference in the sort order.

like image 71
Jongware Avatar answered Oct 11 '22 22:10

Jongware


The NaN test

int isnan(real-floating x);

The isnan macro determines whether its argument value is a NaN. First, an argument represented in a format wider than its semantic type is converted to its semantic type. Then determination is based on the type of the argument.235
235 For the isnan macro, the type for determination does not matter unless the implementation supports NaNs in the evaluation type but not in the semantic type.

isnan(some_long_double) will work as hoped except on a rare platform.

int isunordered(real-floating x, real-floating y) acts like isnan() expect it accounts for both arguments.

On many platforms, code could use (a == a) as a candidate NaN test as that evaluates to 0 when a is NaN and 1 otherwise. Unfortunately, unless an implementation defines __STDC_IEC_559__, that is not certain to work.


The compare
>=, >, <, <= and C11 7.12.14 Comparison macros

Using >=, >, <, <= when at least one operand is a NaN can result in a "invalid" floating-point exception. So prior testing for NaN is prudent as answered by @usr2564301

C offers macros isgreaterequal(), isgreaterequal(), isless(), islessthna() which do the compare and not raise the "invalid" floating-point exception. This is good alternative for double, yet the macros use a real-floating which may differ from long double. isgreater(long_double_a, long_double_a) may evaluate as double and not provide the desired compare result.

The challenge with classify macros is that the semantic type may be narrower than long double.


The following uses the above ideas and as I read the C spec is well defined and functionally correct for all cases except the rare one: when long double has NaN but not the real-floating (often double) does not.

#include <math.h>

// compare 2 long double.  All NaN are greater than numbers.
int compare(const void *a, const void *b) {
  const long double *fa = (const long double *) a;
  const long double *fb = (const long double *) b;

  if (!isunordered(*fa, *fb)) {
    return (*fa > *fb) - (*fa < *fb);
  }

  if (!isnan(*fa)) {
    return -1;
  }
  return isnan(*fb);  // return 0 or 1
}

Note: After reading many of the good comments and learning a great deal, I am posting this self-answer as provided in Can I answer my own question? in addition to accepting another answer.

like image 31
chux - Reinstate Monica Avatar answered Oct 11 '22 23:10

chux - Reinstate Monica