I have an array of integers stored contiguously in memory and I want to add them all to a unordered_set
collection.
Right now, I am adding them one at a time.
for (int i = 0; i < count; i++)
collection.insert(pi[i]);
Is there any way to do this more efficiently?
I realize that the items are not stored contiguously within the collection, so it won't be as simply as just handing the array over to the collection. But can this be optimized somehow?
For a small number of elements, lookups in a set might be faster than lookups in an unordered_set . Even though many operations are faster in the average case for unordered_set , they are often guaranteed to have better worst case complexities for set (for example insert ).
Instead, std::unordered_set can often have approximately 10% faster than std::vector for n≥4 in my experiments. And the std::set is always the worst in that range.
Unordered sets do not allow duplicates and are initialized using comma-delimited values enclosed in curly braces.
The time complexity of set operations is O(log n) while for unordered_set, it is O(1).
unordered_set
has a constructor that takes a range of elements to initially add them:
template< class InputIt >
unordered_set( InputIt first, InputIt last,
size_type bucket_count = /*implementation-defined*/,
const Hash& hash = Hash(),
const key_equal& equal = key_equal(),
const Allocator& alloc = Allocator() );
So you can just do collection = std::unordered_set{ p, p + count };
and leave it up to implementation.
As other user pointed out in comments, there's also an overload for insert
that takes a range:
template< class InputIt >
void insert( InputIt first, InputIt last );
So, just like calling the constructor, you can do, collection.insert(p, p + count);
There's no guarantee that this overloads will be more efficient, since the complexity is linear in both overloads on average, as well as for just inserting elements one by one.
In fact, if we look into how insert
is implemented in MSVC, it's very simple
template<class _Iter>
void insert(_Iter _First, _Iter _Last)
{ // insert [_First, _Last) at front, then put in place
_DEBUG_RANGE(_First, _Last);
for (; _First != _Last; ++_First)
emplace(*_First);
}
so no optimization for this case.
I think, the best way to go about this is to call reserve
, if you know a number of elements you are going to add and, if there are many collisions (which there won't be for integers), maybe modifying bucket_count
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With