Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the difference between std::merge and std::set_union?

Tags:

c++

merge

The question is clear, my google- and cplusplus.com/reference-fu is failing me.

like image 410
rubenvb Avatar asked Mar 06 '11 16:03

rubenvb


3 Answers

std::set_union will contain those elements that are present in both sets only once. std::merge will contain them twice.

For example, with A = {1, 2, 5}; B = {2, 3, 4}:

  • union will give C = {1, 2, 3, 4, 5}
  • merge will give D = {1, 2, 2, 3, 4, 5}

Both work on sorted ranges, and return a sorted result.

Short example:

#include <algorithm>
#include <iostream>
#include <set>
#include <vector>

int main()
{
  std::set<int> A = {1, 2, 5};
  std::set<int> B = {2, 3, 4};

  std::vector<int> out;
  std::set_union(std::begin(A), std::end(A), std::begin(B), std::end(B),
                 std::back_inserter(out));
  for (auto i : out)
  {
    std::cout << i << " ";
  }
  std::cout << '\n';

  out.clear();
  std::merge(std::begin(A), std::end(A), std::begin(B), std::end(B),
             std::back_inserter(out));
  for (auto i : out)
  {
    std::cout << i << " ";
  }
  std::cout << '\n';
}

Output:

1 2 3 4 5 
1 2 2 3 4 5
like image 96
Mat Avatar answered Nov 16 '22 13:11

Mat


std::merge keeps all elements from both ranges, equivalent elements from the first range preceding equivalent elements from the second range in the output. Where an equivalent elements appear in both ranges std::set_union takes only the element from the first range, otherwise each element is merged in order as with std::merge.

References: ISO/IEC 14882:2003 25.3.4 [lib.alg.merge] and 25.3.5.2 [lib.set.union].

like image 34
CB Bailey Avatar answered Nov 16 '22 14:11

CB Bailey


This is the verification I suggested in the comment I posted to the accepted answer (i.e. that if an element is present in one of the input-sets N times, it will appear N times in the output of set_union - so set_union does not remove duplicate equivalent items in the way we would 'naturally' or 'mathematically' expect - if, however, both input-ranges contained a common item once only, then set_union would appear to remove the duplicate)

#include <vector>
#include <algorithm>
#include <iostream>
#include <cassert>

using namespace std;

void printer(int i) { cout << i << ", "; }

int main() {
    int mynumbers1[] = { 0, 1, 2, 3, 3, 4 }; // this is sorted, 3 is dupe
    int mynumbers2[] = { 5 };                // this is sorted


    vector<int> union_result(10);
    set_union(mynumbers1, mynumbers1 + sizeof(mynumbers1)/sizeof(int),
              mynumbers2, mynumbers2 + sizeof(mynumbers2)/sizeof(int),
              union_result.begin());
    for_each(union_result.begin(), union_result.end(), printer);

    return 0;
}

This will print: 0, 1, 2, 3, 3, 4, 5, 0, 0, 0,

like image 2
aho Avatar answered Nov 16 '22 14:11

aho