
How to remove almost duplicates from a vector in C++

I have a std::vector of floats that should not contain duplicates, but the math that populates the vector isn't 100% precise. The vector has values that differ by a few hundredths but should be treated as the same point. For example, here are some of the values in one of them:

...
X: -43.094505
X: -43.094501
X: -43.094498
...

What would be the best/most efficient way to remove duplicates from a vector like this?

Max Rahm asked Apr 22 '14 23:04


2 Answers

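First sort your vector using std::sort. Then use std::unique with a custom predicate to compact the values you keep at the front, and erase the leftover tail. std::unique does not shrink the vector by itself; it only returns an iterator to the new logical end.

std::sort(v.begin(), v.end());
v.erase(std::unique(v.begin(), v.end(),
                    [](double l, double r) { return std::abs(l - r) < 0.01; }),
        v.end());
// treats any numbers that differ by less than 0.01 as equal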
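
For a self-contained version of the above, here is a minimal sketch. The function name `remove_near_duplicates` and the `tol` parameter are made up for illustration and are not part of the answer. One caveat worth keeping in mind: a tolerance comparison is not transitive, while std::unique formally expects an equivalence relation, so how a long chain of closely spaced values collapses may depend on the implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper (name and `tol` are assumptions, not from the answer).
// Sorts v, then erases the elements that std::unique treats as repeats
// under the "differ by less than tol" predicate.
void remove_near_duplicates(std::vector<float>& v, float tol) {
    // Sorting puts near-equal values next to each other.
    std::sort(v.begin(), v.end());

    // std::unique compacts the retained values at the front and returns
    // an iterator one past the last retained value.
    auto new_end = std::unique(
        v.begin(), v.end(),
        [tol](float l, float r) { return std::abs(l - r) < tol; });

    // Erase the leftover tail so the vector's size matches.
    v.erase(new_end, v.end());
}
```

Calling `remove_near_duplicates(v, 0.01f)` on a vector holding the three values from the question would be expected to leave a single element near -43.0945.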


Praetorian answered Nov 14 '22 00:11


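  1. Sort the vector with std::sort() so that near-equal values end up adjacent.

  2. Remove the elements that are not sufficiently unique: call std::unique() with a tolerance-based predicate, which moves the elements to keep to the front.

  3. Last step, call resize() with the new length (the distance from begin() to the iterator returned by std::unique()), and maybe also shrink_to_fit().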
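
The three steps might look roughly like this when written with resize() rather than erase(). This is only a sketch: `dedupe_and_shrink` and `tol` are hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper; the name and the `tol` parameter are assumptions.
void dedupe_and_shrink(std::vector<float>& v, float tol) {
    // 1. Sort so that near-equal values sit next to each other.
    std::sort(v.begin(), v.end());

    // 2. Compact the retained values at the front of the vector.
    auto new_end = std::unique(
        v.begin(), v.end(),
        [tol](float l, float r) { return std::abs(l - r) < tol; });

    // 3. Cut the vector down to the retained prefix.
    v.resize(static_cast<std::size_t>(std::distance(v.begin(), new_end)));

    // Optional: a non-binding request to release unused capacity.
    v.shrink_to_fit();
}
```

shrink_to_fit() is only a request, so the capacity may or may not drop; it is probably worth calling only when many elements were removed and memory matters.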

If you want to preserve the order, do the previous 3 steps on a copy (omit shrinking though).
Then use std::remove_if on the original with a lambda that looks each element up in the copy (binary search, since the copy is sorted). If the element is found, retain it and remove it from the copy so that a later identical value is not retained again; if it is not found, drop it.
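
A rough sketch of the order-preserving variant follows. It swaps std::remove_if for a plain loop so that the stateful lookup stays explicit; `dedupe_keep_order` and `tol` are hypothetical names, not something the answer specifies.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper; the name and the `tol` parameter are assumptions.
void dedupe_keep_order(std::vector<float>& v, float tol) {
    // Sort and deduplicate a copy: `keep` ends up sorted, holding the
    // values that survived std::unique.
    std::vector<float> keep(v);
    std::sort(keep.begin(), keep.end());
    keep.erase(std::unique(keep.begin(), keep.end(),
                           [tol](float l, float r) {
                               return std::abs(l - r) < tol;
                           }),
               keep.end());

    // Walk the original in order; retain an element only if it exactly
    // matches a value still present in `keep`.
    auto out = v.begin();
    for (float x : v) {
        auto it = std::lower_bound(keep.begin(), keep.end(), x);
        if (it != keep.end() && *it == x) {
            keep.erase(it);  // consume it so a later identical value is dropped
            *out++ = x;      // `out` never runs ahead of the element being read
        }
    }
    v.erase(out, v.end());
}
```

Note that the value retained from each group of near-duplicates is the one that survived in the sorted copy (the smallest of the group), kept at the position where it appears in the original vector.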

Deduplicator answered Nov 14 '22 01:11