I have a vector containing a few non-adjacent duplicates.
As a simple example, consider:
2 1 6 1 4 6 2 1 1
I am trying to make this vector unique by removing the non-adjacent duplicates and maintaining the order of elements.
Result would be:
2 1 6 4
The solutions I tried are:
Manual duplicate elimination:
Define a temporary vector TempVector.
for (each element in the vector) {
    if (the element does not exist in TempVector) {
        add it to TempVector;
    }
}
Swap the original vector with TempVector.
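The pseudocode above could be written in C++ roughly as follows. This is only a minimal sketch: the name remove_duplicates_manual is hypothetical, and it assumes a vector of int. Because the "does not exist" test is a linear search with std::find, the whole routine costs O(n^2) comparisons in the worst case.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not a standard function): keeps the first occurrence
// of each value and preserves the original order of the elements.
void remove_duplicates_manual(std::vector<int>& v)
{
    std::vector<int> temp;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        // Linear search in temp: this is what makes the routine quadratic.
        if (std::find(temp.begin(), temp.end(), *it) == temp.end()) {
            temp.push_back(*it);
        }
    }
    v.swap(temp);  // the original vector now holds only the unique elements
}
```

Calling remove_duplicates_manual on the example vector should leave it with the first occurrences only, in their original order.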
My question is:
Is there any STL algorithm that can remove the non-adjacent duplicates from a vector? What is its complexity?
Using the std::remove function:
A simple solution is to iterate over the vector and, for each element, remove all of its later duplicates from the vector if present. We can either write our own routine for this or use the std::remove algorithm, which makes the code more elegant. This approach takes constant extra space but runs in O(n^2) time.
The erase() function can remove a single element or a range of elements from the beginning, the middle, or the end of the vector. To remove all the elements, a single call to erase() with the range from begin() to end() is enough (clear() does the same); there is no need to call erase() once per element.
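As a small illustration, erase() also accepts an iterator range, so a whole tail of the vector can be dropped in one call. In this sketch, keep_first is a hypothetical helper name, not a standard function, and the element type is assumed to be int.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: keep only the first n elements of v.
// Uses the range overload of std::vector::erase, so it is a single call
// regardless of how many elements are removed.
void keep_first(std::vector<int>& v, std::size_t n)
{
    if (n < v.size()) {
        v.erase(v.begin() + static_cast<std::vector<int>::difference_type>(n),
                v.end());
    }
}
```

The same range form is what the set-based answer uses at the end, when it erases everything from the write iterator up to end().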
Yes, but sorting a vector modifies the original content.
I think you would do it like this:
I would use two iterators on the vector:
The first one reads the data and inserts it into a temporary set.
When the value read was not already in the set, you copy it from the first iterator to the second and increment the second.
At the end, you keep only the data up to the second iterator.
The complexity is O(n log n), as the lookup for duplicate elements uses the set, not the vector.
#include <vector>
#include <set>
#include <iostream>

int main(int argc, char* argv[])
{
    std::vector< int > k ;
    k.push_back( 2 );
    k.push_back( 1 );
    k.push_back( 6 );
    k.push_back( 1 );
    k.push_back( 4 );
    k.push_back( 6 );
    k.push_back( 2 );
    k.push_back( 1 );
    k.push_back( 1 );

    {
        // r reads every element, w marks where the next unique element goes
        std::vector< int >::iterator r , w ;
        std::set< int > tmpset ;

        for( r = k.begin() , w = k.begin() ; r != k.end() ; ++r )
        {
            // insert().second is true only the first time a value is seen
            if( tmpset.insert( *r ).second )
            {
                *w++ = *r ;
            }
        }

        // drop everything after the last unique element
        k.erase( w , k.end() );
    }

    {
        std::vector< int >::iterator r ;
        for( r = k.begin() ; r != k.end() ; ++r )
        {
            std::cout << *r << std::endl ;
        }
    }
}
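The same read/write iterator idea can be packaged as a reusable function. The following is only a sketch under stated assumptions: the name stable_unique is hypothetical, and the element type must support operator< so that it can be stored in a std::set.

```cpp
#include <set>
#include <vector>

// Hypothetical generic wrapper around the two-iterator technique.
// Keeps the first occurrence of each value, preserves order, and costs
// O(n log n) because each lookup goes through the std::set.
template <class T>
void stable_unique(std::vector<T>& v)
{
    std::set<T> seen;
    typename std::vector<T>::iterator w = v.begin();
    for (typename std::vector<T>::iterator r = v.begin(); r != v.end(); ++r) {
        // insert().second is true only when *r was not in the set yet
        if (seen.insert(*r).second) {
            *w++ = *r;
        }
    }
    v.erase(w, v.end());  // discard the leftover tail
}
```

The cost of this version is the extra memory for the set, which holds one copy of every distinct value.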
Without using a temporary set, it's possible to do this with (possibly) some loss of performance:
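#include <algorithm>  // std::remove

template<class Iterator>
Iterator Unique(Iterator first, Iterator last)
{
    while (first != last)
    {
        Iterator next(first);
        // remove every later copy of *first; last shrinks to the new logical end
        last = std::remove(++next, last, *first);
        first = next;
    }
    return last;
}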
used as in:
vec.erase( Unique( vec.begin(), vec.end() ), vec.end() );
For smaller data sets, the implementation simplicity and the lack of extra allocation may offset the theoretically higher complexity compared with using an additional set. Measurement with a representative input is the only way to be sure, though.
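One way such a measurement might be set up is with a small timing helper. This is only a rough sketch, assuming a C++11 compiler; time_ms is a hypothetical name, and a single run like this is noisy, so a real comparison would repeat it several times on realistic input.

```cpp
#include <chrono>

// Hypothetical helper: runs f once and returns the elapsed wall-clock time
// in milliseconds, measured with a monotonic clock.
template <class F>
double time_ms(F f)
{
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    f();
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, time_ms could be wrapped around a lambda that calls vec.erase( Unique( vec.begin(), vec.end() ), vec.end() ) on a copy of the input, and again around the set-based loop on another copy, so both approaches see identical data.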