I have two unsorted random access arrays of a single simple element type (int/string/etc, so has all comparison operators, can be hashed, etc.). There should not be duplicate elements in either array.
Looking for a general algorthim that given these arrays A and B will tell me:
I guess I could do this with the set operators as below, but is there a faster solution (e.g. one that doesnt require me to build two sorted sets)?
r1 = std::set_intersection(a,b);
r2 = std::set_difference(a,b);
r3 = std::set_difference(b,a);
First, it's not clear from your question whether you mean
std::set when you speak of sorted sets. If so, then your
first reaction should be to use std::vector, if you can, on
the original vectors. Just sort them, and then:
std::vector<T> r1;
std::set_intersection( a.cbegin(), a.cend(), b.cbegin(), b.cend(), std::back_inserter( r1 ) );
And the same for r2 and r3.
Beyond that, I doubt that there's much you can do. Just one loop might improve things some:
std::sort( a.begin(), a.end() );
std::sort( b.begin(), b.end() );
onlyA.reserve( a.size() );
onlyB.reserve( b.size() );
both.reserve( std::min( a.size(), b.size() ) );
auto ita = a.cbegin();
auto enda = a.cend();
auto itb = b.cbegin();
auto endb = b.cend();
while ( ita != enda && itb != endb ) {
if ( *ita < *itb ) {
onlyA.push_back( *ita );
++ ita;
} else if ( *itb < *ita ) {
onlyB.push_back( *itb );
++ itb;
} else {
both.push_back( *ita );
++ ita;
++ itb;
}
}
onlyA.insert( onlyA.end(), ita, enda );
onlyB.insert( onlyB.end(), itb, endb );
The reserve could make a difference, and unless most of the
elements end up in the same vector, probably won't cost much
extra memory.
Something like the following algorithm will run O(|A|+|B|) (assuming O(1) behavior from unordered_map):
onlyA initially contain all of A, and lists onlyB and bothAB start out as empty.Amap associate elements in onlyA with its corresponding iterator in onlyA.B
bothABonlyA using aionlyBAt the end of the above algorithm,
Below is an implementation of the above. The result is returned as a tuple <onlyA, onlyB, bothAB>.
template <typename C>
auto venn_ify (const C &A, const C &B) ->
std::tuple<
std::list<typename C::value_type>,
std::list<typename C::value_type>,
std::list<typename C::value_type>
>
{
typedef typename C::value_type T;
typedef std::list<T> LIST;
LIST onlyA(A.begin(), A.end()), onlyB, bothAB;
std::unordered_map<T, typename LIST::iterator> Amap(2*A.size());
for (auto a = onlyA.begin(); a != onlyA.end(); ++a) Amap[*a] = a;
for (auto b : B) {
auto ai = Amap.find(b);
if (ai == Amap.end()) onlyB.push_back(b);
else {
bothAB.push_back(b);
onlyA.erase(ai->second);
}
}
return std::make_tuple(onlyA, onlyB, bothAB);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With