Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::max_element skip over NANs

Tags:

c++

arrays

std

I have a std::vector<double> that may contain several NAN values. I want to find the largest element in the vector. How can I efficiently skip the NANs in the comparison? I'd like to avoid having to call isnan on each element. any ideas?

// std::max_element([NAN,NAN,NAN,-31,-89]) = NAN 
// because NAN > -31 returns NAN.
// how can I skip all NANs in the comparison?
// test 2 below is my use case.

#include <vector>
#include <iostream>
#include <cmath>

void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::iterator v_iter;
    v_iter = std::max_element(v.begin(),v.end());
    imax = std::distance(v.begin(), v_iter);
    max  = *v_iter;
}

int main(){

    std::vector<double> v_vec;
    std::vector<double>::iterator v_vec_iter;
    int imax;
    double val;

    std::cout << "test 1. " << std::endl;

    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 18.4 );

    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;

    std::cout << "test 2: my case. " << std::endl;

    v_vec.clear();
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 31.0 );

    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;

};

this returns:

test 1. 
max(v_vec) = 18.4
indmax(v_vec) = 3
test 2. 
max(v_vec) = nan
indmax(v_vec) = 0
like image 440
Florian Oswald Avatar asked Mar 09 '16 10:03

Florian Oswald


3 Answers

You can provide a custom comparison for max_element:

void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::iterator v_iter;
    v_iter = std::max_element(v.begin(),v.end(),
    [] (auto x, auto y)
    {
        return x < y ? true : isnan(x);
    });
    imax = std::distance(v.begin(), v_iter);
    max  = *v_iter;
}
like image 58
tahsmith Avatar answered Nov 19 '22 19:11

tahsmith


I would try something like this:

void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::size_type p=0;
    imax = -1;
    max = std::numeric_limits<double>::lowest();

    for (auto &val : v)
    {
        if (!std::isnan(val) && val>max)
        {
            imax = p;
            max = val;
        }
        p++;
    }
}
like image 20
Juan Leni Avatar answered Nov 19 '22 19:11

Juan Leni


The problem is that std::max_element uses std::less as its comparator by default. Depending on the order in which it processes the vector's elements, a NAN may appear on the right-hand side of the comparison. Since all comparisons with NANs return false, this means that NAN can appear greater than all other elements.

Put differently, when you use std::max_element with the default comparator on a vector with NANs, the result is actually undefined since it depends on the implementation, and on the order of elements. For example, on GCC, if I place all your NANs at the end of the vector, I (randomly) get the desired result.

So, you have no other option than to provide your own comparison operator:

#include <vector>
#include <iostream>
#include <cmath>
#include <algorithm>

template <typename T>
struct NaNAwareLess
{
  bool operator () (T a, T b) const
  {
    if (std::isnan(b))
    {
      return false; // Assume NaN is less than *any* non-NaN value.
    }
    if (std::isnan(a))
    {
      return true; // Assume *any* non-NaN value is greater than NaN.
    }
    return (a < b);
  }
};

void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::iterator v_iter;
    v_iter = std::max_element<std::vector<double>::iterator, NaNAwareLess<double> >(v.begin(),v.end(),NaNAwareLess<double>());
    imax = std::distance(v.begin(), v_iter);
    max  = *v_iter;
}

int main(){

    std::vector<double> v_vec;
    std::vector<double>::iterator v_vec_iter;
    int imax;
    double val;

    std::cout << "test 1. " << std::endl;

    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 18.4 );

    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;

    std::cout << "test 2: my case. " << std::endl;

    v_vec.clear();
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 31.0 );

    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;
    std::cout << std::boolalpha << std::less<double>()(NAN, -33.0) << std::endl;
    std::cout << std::boolalpha << std::less<double>()(-33.0, NAN) << std::endl;
};

I don't think you can avoid calling isnan. And there's another important aspect as well: From personal experience, I found performing operations on NAN values to be a lot slower than on any other values (probably because of FPU exception handling). So, while using isnan might be annoying, it could also make quite a positive difference in performance.

like image 4
mindriot Avatar answered Nov 19 '22 19:11

mindriot