Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checking for duplicates in a vector [duplicate]

Tags:

c++

Possible Duplicate:
Determining if an unordered vector<T> has all unique elements

I have to check a vector for duplicates. What is the best way to approach this:

I take the first element, compare it against all other elements in the vector. Then take the next element and do the same and so on.

Is this the best way to do it, or is there a more efficient way to check for dups?

like image 606
xbonez Avatar asked May 18 '10 19:05

xbonez


2 Answers

If your vector is an STL container, the solution is easy:

std::sort(myvec.begin(), myvec.end());
std::erase(std::unique(myvec.begin(), myvec.end()), myvec.end());

According to cppreference (https://en.cppreference.com/w/cpp/algorithm/unique), the elements are shifted around so that the values from myvec.begin() to the return value of std::unique are all unique. The elements after the iterator returned by std::unique are unspecified (useless in every use-case I've seen) so remove them from the std::vector<A> using std::vector<A>::erase.

like image 78
Patrick Avatar answered Sep 23 '22 23:09

Patrick


Use a hash table in which you insert each element. Before you insert an element, check if it's already there. If it is, you have yourself a duplicate. This is O(n) on average, but the worst case is just as bad as your current method.

Alternatively, you can use a set to do the same thing in O(n log n) worst case. This is as good as the sorting solution, except it doesn't change the order of the elements (uses more memory though since you create a set).

Another way is to copy your vector to another vector, sort that and check the adjacent elements there. I'm not sure if this is faster than the set solution, but I think sorting adds less overhead than the balanced search trees a set uses so it should be faster in practice.

Of course, if you don't care about keeping the original order of the elements, just sort the initial vector.

like image 35
IVlad Avatar answered Sep 20 '22 23:09

IVlad