Possible Duplicate:
Determining if an unordered vector<T> has all unique elements
I have to check a vector for duplicates. What is the best way to approach this:
I take the first element, compare it against all other elements in the vector. Then take the next element and do the same and so on.
Is this the best way to do it, or is there a more efficient way to check for dups?
If your vector is an STL container, the solution is easy:
std::sort(myvec.begin(), myvec.end());
std::erase(std::unique(myvec.begin(), myvec.end()), myvec.end());
According to cppreference (https://en.cppreference.com/w/cpp/algorithm/unique), the elements are shifted around so that the values from myvec.begin()
to the return value of std::unique
are all unique. The elements after the iterator returned by std::unique
are unspecified (useless in every use-case I've seen) so remove them from the std::vector<A>
using std::vector<A>::erase
.
Use a hash table in which you insert each element. Before you insert an element, check if it's already there. If it is, you have yourself a duplicate. This is O(n)
on average, but the worst case is just as bad as your current method.
Alternatively, you can use a set to do the same thing in O(n log n)
worst case. This is as good as the sorting solution, except it doesn't change the order of the elements (uses more memory though since you create a set).
Another way is to copy your vector to another vector, sort that and check the adjacent elements there. I'm not sure if this is faster than the set solution, but I think sorting adds less overhead than the balanced search trees a set uses so it should be faster in practice.
Of course, if you don't care about keeping the original order of the elements, just sort the initial vector.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With