I have a std::vector<double> that may contain several NAN values. I want to find the largest element in the vector. How can I efficiently skip the NANs in the comparison? I'd like to avoid having to call isnan on each element. Any ideas?
// std::max_element([NAN,NAN,NAN,-31,-89]) = NAN
// because every comparison involving NAN is false, so the leading NAN is never replaced.
// how can I skip all NANs in the comparison?
// test 2 below is my use case.
#include <algorithm>
#include <cmath>
#include <iostream>
#include <iterator>
#include <vector>

void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::iterator v_iter;
    v_iter = std::max_element(v.begin(),v.end());
    imax = std::distance(v.begin(), v_iter);
    max = *v_iter;
}

int main(){
    std::vector<double> v_vec;
    int imax;
    double val;

    std::cout << "test 1. " << std::endl;
    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 18.4 );
    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;

    std::cout << "test 2: my case. " << std::endl;
    v_vec.clear();
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 31.0 );
    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;
}
This returns:
test 1.
max(v_vec) = 18.4
indmax(v_vec) = 3
test 2: my case.
max(v_vec) = nan
indmax(v_vec) = 0
You can provide a custom comparison for max_element:
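// Requires <algorithm>, <cmath>, <iterator> and <vector>.
void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::iterator v_iter;
    v_iter = std::max_element(v.begin(),v.end(),
                              [] (double x, double y)
                              {
                                  // NaN ranks below every number; two NaNs are equivalent,
                                  // which keeps the comparison a strict weak ordering.
                                  return std::isnan(x) ? !std::isnan(y) : x < y;
                              });
    imax = std::distance(v.begin(), v_iter);
    max = *v_iter;
}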
I would try something like this:
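// Requires <cmath>, <limits> and <vector>.
void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::size_type p=0;
    imax = -1;  // stays -1 if v is empty or holds only NaNs
    max = std::numeric_limits<double>::lowest();
    for (auto &val : v)
    {
        // The imax check lets the first non-NaN value win even if it is -infinity.
        if (!std::isnan(val) && (imax < 0 || val > max))
        {
            imax = static_cast<int>(p);
            max = val;
        }
        p++;
    }
}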
The problem is that std::max_element compares elements with operator< by default (which is what std::less does for double). Depending on the order in which it processes the vector's elements, a NAN may end up on either side of that comparison. Since every ordered comparison involving a NAN returns false, a NAN that is the current candidate is never replaced, which means that a NAN can appear greater than all other elements.
Put differently, when you use std::max_element with the default comparator on a vector containing NANs, the result is not well defined: with NANs present, operator< no longer provides a strict weak ordering, so the outcome depends on the implementation and on the order of the elements. For example, on GCC, if I place all your NANs at the end of the vector, I happen to get the desired result.
So, if you want to keep using std::max_element, you have to provide your own comparator:
#include <algorithm>
#include <cmath>
#include <functional>
#include <iostream>
#include <iterator>
#include <vector>

template <typename T>
struct NaNAwareLess
{
    bool operator () (T a, T b) const
    {
        if (std::isnan(b))
        {
            return false; // Nothing is less than NaN: NaN ranks below every non-NaN value.
        }
        if (std::isnan(a))
        {
            return true; // a is NaN and b is not, so a is less than b.
        }
        return (a < b);
    }
};

void vector_max(std::vector<double> v, double &max, int &imax){
    std::vector<double>::iterator v_iter;
    v_iter = std::max_element(v.begin(),v.end(),NaNAwareLess<double>());
    imax = std::distance(v.begin(), v_iter);
    max = *v_iter;
}

int main(){
    std::vector<double> v_vec;
    int imax;
    double val;

    std::cout << "test 1. " << std::endl;
    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 18.4 );
    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;

    std::cout << "test 2: my case. " << std::endl;
    v_vec.clear();
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( NAN );
    v_vec.push_back( -33.0 );
    v_vec.push_back( -124.0 );
    v_vec.push_back( -31.0 );
    v_vec.push_back( 31.0 );
    vector_max(v_vec,val,imax);
    std::cout << "max(v_vec) = " << val << std::endl;
    std::cout << "indmax(v_vec) = " << imax << std::endl;

    // Both lines print false: with a NAN operand, std::less reports neither order.
    std::cout << std::boolalpha << std::less<double>()(NAN, -33.0) << std::endl;
    std::cout << std::boolalpha << std::less<double>()(-33.0, NAN) << std::endl;
}
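One case the program above does not cover is a vector that is empty or holds nothing but NANs: max_element then returns end() or an iterator to a NAN. A possible way to guard against that is sketched below. The helper nan_aware_max and its std::optional return type are my own assumptions, not part of the answer, and it needs C++17.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper: returns (index, value) of the largest non-NaN element,
// or std::nullopt if the vector is empty or contains only NaNs.
std::optional<std::pair<std::size_t, double>>
nan_aware_max(const std::vector<double>& v) {
    if (v.empty()) {
        return std::nullopt;
    }
    // Same ordering as NaNAwareLess: NaN ranks below every number.
    auto less = [](double a, double b) {
        if (std::isnan(b)) return false;
        if (std::isnan(a)) return true;
        return a < b;
    };
    auto it = std::max_element(v.begin(), v.end(), less);
    if (std::isnan(*it)) {
        return std::nullopt;  // every element was NaN
    }
    auto index = static_cast<std::size_t>(std::distance(v.begin(), it));
    return std::make_pair(index, *it);
}
```

Taking the vector by const reference also avoids the copy that the by-value vector_max signature makes on every call.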
I don't think you can avoid calling isnan. There's another important aspect as well: in my personal experience, operations on NAN values can be a lot slower than on any other values (probably because of FPU exception handling). So, while calling isnan might be annoying, it could also make quite a positive difference in performance.
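If only the maximum value is needed and not its index, one way to keep isnan out of your own code is std::fmax, which returns the other argument when exactly one of its arguments is NAN. The sketch below is just an illustration: max_ignoring_nan is a hypothetical name, and I make no claim that it is faster than an explicit isnan check.

```cpp
#include <cmath>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical helper: largest non-NaN value in v, or NaN if v is empty
// or contains only NaNs. It does not report the position of the maximum.
double max_ignoring_nan(const std::vector<double>& v) {
    const double quiet_nan = std::numeric_limits<double>::quiet_NaN();
    // Starting from NaN means the first real number always replaces it,
    // because std::fmax(NaN, x) yields x.
    return std::accumulate(v.begin(), v.end(), quiet_nan,
                           [](double acc, double x) { return std::fmax(acc, x); });
}
```

For the vector from test 2 this yields 31. The caller still has to check the result with std::isnan once to detect the "no valid element" case.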