Is there a convenient way to remove items from a vector (or another stl container) whose values are within an certain interval?
So for example: I got an vector with float values
1.1 1.3 2.2 3.2 4.1 5.2 5.1 1.1 8.0 2.1
and a delta of 0.2, which should lead to the following result
1.1 2.2 3.2 4.1 5.1 8.0
thus removing all "duplicate" items within the delta and keep one of the values within the range. It can be assumed that the values are "clustered", where the difference between these is more than 3*delta. Only one value (the mean) of the cluster should be kept, all the others from the cluster should be removed.
For sure, it is possible to iterate with nested loops, but this seems quite complicated, because of the changing iterators, so I thought of a more convenient way. I found remove_if for example, but this function cannot "compare".
Thanks for suggestions.
You can use std::unique
with a predicate:
template <typename It, typename Predicate>
It unique(It first, It last, Predicate pred);
The most used form of std::unique
takes no predicate and just removes duplicates from a sequence. But you can write one that implements your filter (in your case, compares two values using a gap) and you are set. Something like:
bool CompareWithGap(double a, double b)
{
return abs(a - b) <= 0.2;
}
And use it to call std::unique
:
auto it = std::unique(v.begin(), v.end(), CompareWithGap);
Where v
is your vector (or any other sequence).
EDIT: Forgot to mention that you need to sort your sequence before using std::unique
. If this is not an option, you have to write your own algorithm.
2nd EDIT: For completion, as pointed out by Christian Rau in his comment, you can now erase the items removed from your sequence by using the erase
method in it:
v.erase(it, v.end());
You question is actually quite complicated because 'one of the values within the range' is not well defined. For instance given
1.1 1.2 1.3
which do you want to keep? Presumably the first one, i.e. 1.1. OK now how about
0.9 1.1 1.3
Following the first one rule, we keep 0.9 and 1.3, But we could have kept just 1.1 instead. The question I think is are 0.9 and 1.3 'duplicates' or not? I don't think you've defined this well enough.
What about this case, 1.1 1.2 1.3 1.4 1.5 1.6? All values are within 0.2 of one other value but not all are within 0.2 of every other value. So are they all duplicates? Or do you need to split them up? If so how should they be split up, pick 1.1 and 1.4 perhaps?
So I think until you define your problem a bit more precisely it's not really possible for you to write the code or for us to help you.
You might want to look at the disjoint-set data structure. Depending on precisely what you are trying to do it may be the most efficient way to solve this problem.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With