
How to make elements of vector unique? (remove non adjacent duplicates)

I have a vector containing a few non-adjacent duplicates.

As a simple example, consider:

2 1 6 1 4 6 2 1 1 

I am trying to make this vector unique by removing the non-adjacent duplicates and maintaining the order of elements.

The expected result is:

2 1 6 4  

The solutions I tried are:

  1. Inserting into a std::set, but this disturbs the order of the elements.
  2. Using a combination of std::sort and std::unique, but this has the same ordering problem.
  3. Manual duplicate elimination:

        Define a temporary vector TempVector.
        for (each element in the vector)
        {
            if (the element does not exist in TempVector)
            {
                add it to TempVector;
            }
        }
        Swap the original vector with TempVector.
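For reference, the third approach could be written in C++ roughly as follows. This is only a sketch of the pseudocode above: the function name `remove_duplicates_linear_search` is made up for illustration, and it assumes the element type supports `operator==`. Because each lookup scans the elements kept so far with `std::find`, the worst case is quadratic in the number of elements.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper name; a direct rendering of the pseudocode.
// Keeps the first occurrence of each value and preserves the original order.
template <class T>
void remove_duplicates_linear_search(std::vector<T>& values)
{
    std::vector<T> temp;
    temp.reserve(values.size());

    for (typename std::vector<T>::const_iterator it = values.begin();
         it != values.end(); ++it)
    {
        // Linear scan over the elements kept so far: O(n) per lookup.
        if (std::find(temp.begin(), temp.end(), *it) == temp.end())
        {
            temp.push_back(*it);
        }
    }

    // Replace the original contents with the de-duplicated sequence.
    values.swap(temp);
}
```

Calling it on a vector holding 2 1 6 1 4 6 2 1 1 should leave 2 1 6 4 in the vector.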

My question is:

Is there an STL algorithm that can remove non-adjacent duplicates from a vector? If so, what is its complexity?

aJ. asked Sep 21 '09 07:09




2 Answers

I think you could do it like this, using two iterators on the vector:

The first iterator reads each element and inserts it into a temporary set.

When the element was not already in the set, you copy it from the first iterator to the second one and increment the second.

At the end, you keep only the data up to the second iterator.

The complexity is O(n log n), because the lookup for duplicated elements uses the set, not the vector.

    #include <vector>
    #include <set>
    #include <iostream>

    int main(int argc, char* argv[])
    {
        std::vector< int > k ;

        k.push_back( 2 );
        k.push_back( 1 );
        k.push_back( 6 );
        k.push_back( 1 );
        k.push_back( 4 );
        k.push_back( 6 );
        k.push_back( 2 );
        k.push_back( 1 );
        k.push_back( 1 );

        {
            std::vector< int >::iterator r , w ;

            std::set< int > tmpset ;

            // r reads every element; w marks where the next unique element goes.
            for( r = k.begin() , w = k.begin() ; r != k.end() ; ++r )
            {
                // insert() reports in .second whether the value was new.
                if( tmpset.insert( *r ).second )
                {
                    *w++ = *r ;
                }
            }

            // Drop the leftover tail after the last unique element.
            k.erase( w , k.end() );
        }

        {
            std::vector< int >::iterator r ;

            for( r = k.begin() ; r != k.end() ; ++r )
            {
                std::cout << *r << std::endl ;
            }
        }
    }
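As a possible variation (not part of the code above), the same read/write-iterator idea can use `std::unordered_set` from C++11 instead of `std::set`. Assuming the element type has a `std::hash` specialization and `operator==`, and that the hash distributes values reasonably, the lookups become constant time on average, so the whole pass is expected O(n) rather than O(n log n). The function name `stable_unique_hashed` is made up for this sketch.

```cpp
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper; assumes std::hash<T> and operator== are available for T.
// Keeps the first occurrence of each value and preserves the original order.
template <class T>
void stable_unique_hashed(std::vector<T>& values)
{
    std::unordered_set<T> seen;
    seen.reserve(values.size());

    typename std::vector<T>::iterator write = values.begin();
    for (typename std::vector<T>::iterator read = values.begin();
         read != values.end(); ++read)
    {
        // insert() returns a pair; .second is true only for a new value.
        if (seen.insert(*read).second)
        {
            if (write != read)
            {
                *write = std::move(*read);  // compact unique elements to the front
            }
            ++write;
        }
    }

    // Everything from write onward is leftover and can be dropped.
    values.erase(write, values.end());
}
```

The trade-off is the same as with the set: extra memory proportional to the number of distinct elements, in exchange for faster duplicate lookups.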
fa. answered Sep 20 '22 22:09


Without using a temporary set it's possible to do this with (possibly) some loss of performance:

    #include <algorithm>

    template<class Iterator>
    Iterator Unique(Iterator first, Iterator last)
    {
        while (first != last)
        {
            Iterator next(first);
            // Remove every later copy of *first; the range shrinks to the new last.
            last = std::remove(++next, last, *first);
            first = next;
        }

        return last;
    }

used as in:

    vec.erase( Unique( vec.begin(), vec.end() ), vec.end() );

For smaller data sets, the simplicity of the implementation and the lack of extra allocation may offset its theoretically higher complexity (O(n²) in the worst case) compared with using an additional set. Measurement with a representative input is the only way to be sure, though.
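One way to take such a measurement is sketched below with `std::chrono::steady_clock` (C++11). The names `unique_with_set`, `unique_with_remove`, `Timed` and `time_dedup` are made up for this example, and the code is fixed to `std::vector<int>` to keep it short. A single timed call is noisy, so in practice you would repeat the run several times on input that resembles your real data.

```cpp
#include <algorithm>
#include <chrono>
#include <set>
#include <utility>
#include <vector>

// Set-based variant: O(n log n), needs extra memory for the set.
void unique_with_set(std::vector<int>& v)
{
    std::set<int> seen;
    std::vector<int>::iterator w = v.begin();
    for (std::vector<int>::iterator r = v.begin(); r != v.end(); ++r)
    {
        if (seen.insert(*r).second)
        {
            *w++ = *r;
        }
    }
    v.erase(w, v.end());
}

// std::remove-based variant: no extra allocation, O(n^2) in the worst case.
void unique_with_remove(std::vector<int>& v)
{
    std::vector<int>::iterator first = v.begin();
    std::vector<int>::iterator last = v.end();
    while (first != last)
    {
        std::vector<int>::iterator next = first;
        last = std::remove(++next, last, *first);
        first = next;
    }
    v.erase(last, v.end());
}

// Result of one timed run: the de-duplicated data and the elapsed time.
struct Timed
{
    std::vector<int> result;
    long long microseconds;
};

// Runs dedup once on a private copy of the input and measures wall-clock time.
template <class Dedup>
Timed time_dedup(Dedup dedup, std::vector<int> input)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    dedup(input);
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();

    Timed t;
    t.result = std::move(input);
    t.microseconds =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    return t;
}
```

For example, `time_dedup(unique_with_set, data).microseconds` and `time_dedup(unique_with_remove, data).microseconds` can be compared for the same `data`, and the two `result` vectors should be identical.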

CB Bailey answered Sep 20 '22 22:09