Keep the duplicated values only - Vectors C++

Question

Assume I have a vector with the following elements {1, 1, 2, 3, 3, 4} I want to write a program with c++ code to remove the unique values and keep only the duplicated once. So the end result will be something like this {1,3}.

So far this is what I've done, but it takes a lot of time, Is there any way this can be more efficient,

vector <int> g1 = {1,1,2,3,3,4}
vector <int> g2;

for(int i = 0; i < g1.size(); i++)
{
  if(count(g1.begin(), g1.end(), g1[i]) > 1)
    g2.push_back(g1[i]);

}

v.erase(std::unique(g2.begin(), g2.end()), g2.end());

for(int i = 0; i < g2.size(); i++)
{
  cout << g2[i];
}

BoBTFish · Accepted Answer

My approach is to create an <algorithm>-style template, and use an unordered_map to do the counting. This means you only iterate over the input list once, and the time complexity is O(n). It does use O(n) extra memory though, and isn't particularly cache-friendly. Also this does assume that the type in the input is hashable.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_map>

template <typename InputIt, typename OutputIt>
OutputIt copy_duplicates(
        InputIt  first,
        InputIt  last,
        OutputIt d_first)
{
    std::unordered_map<typename std::iterator_traits<InputIt>::value_type,
                       std::size_t> seen;
    for ( ; first != last; ++first) {
        if ( 2 == ++seen[*first] ) {
            // only output on the second time of seeing a value
            *d_first = *first;
            ++d_first;
        }
    }
    return d_first;
}

int main()
{
    int i[] = {1, 2, 3, 1, 1, 3, 5}; // print 1, 3,
    //                  ^     ^
    copy_duplicates(std::begin(i), std::end(i),
                    std::ostream_iterator<int>(std::cout, ", "));
}

This can output to any kind of iterator. There are special iterators you can use that when written to will insert the value into a container.

Omnifarious · Answer

Here's a way that's a little more cache friendly than unordered_map answer, but is O(n log n) instead of O(n), though it does not use any extra memory and does no allocations. Additionally, the overall multiplier is probably higher, in spite of it's cache friendliness.

#include <vector>
#include <algorithm>

void only_distinct_duplicates(::std::vector<int> &v)
{
    ::std::sort(v.begin(), v.end());
    auto output = v.begin();
    auto test = v.begin();
    auto run_start = v.begin();
    auto const end = v.end();
    for (auto test = v.begin(); test != end; ++test) {
       if (*test == *run_start) {
           if ((test - run_start) == 1) {
              *output = *run_start;
              ++output;
           }
       } else {
           run_start = test;
       }
    }
    v.erase(output, end);
}

I've tested this, and it works. If you want a generic version that should work on any type that vector can store:

template <typename T>
void only_distinct_duplicates(::std::vector<T> &v)
{
    ::std::sort(v.begin(), v.end());
    auto output = v.begin();
    auto test = v.begin();
    auto run_start = v.begin();
    auto const end = v.end();
    for (auto test = v.begin(); test != end; ++test) {
       if (*test != *run_start) {
           if ((test - run_start) > 1) {
              ::std::swap(*output, *run_start);
              ++output;
           }
           run_start = test;
       }
    }
    if ((end - run_start) > 1) {
        ::std::swap(*output, *run_start);
        ++output;
    }
    v.erase(output, end);
}

Keep the duplicated values only - Vectors C++

Tags:

c++

vector

Mohamed Abdel Aziz

2 Answers

BoBTFish

Omnifarious

Recent Activity

Donate For Us

Keep the duplicated values only - Vectors C++

Tags:

c++

vector

Mohamed Abdel Aziz

2 Answers

BoBTFish

Omnifarious

Related questions

Recent Activity

Donate For Us