
Is there a faster way to remove and store an element from an unordered set?

I have an unordered set like below:

[1,2,3,4,6,7,5]

I want to remove and store an element from my unordered set and I don't care which element is removed.

I am currently doing the following. Is there a faster way to do it?

auto it = set_of_ints.begin();
set_of_ints.erase(it);
.....
.....
std::cout << "removed element is: " << *it << std::endl;

I meant to paste the print statement before the erase but many answers discuss that issue. So I am leaving it as is.

Morpheus asked Jan 30 '19 10:01




2 Answers

No, the std::unordered_set::erase member function is the only function meant to be used when erasing elements from the set, and the docs say:

Complexity
Given an instance c of unordered_set:
1) Average case: constant, worst case: c.size()
[...]

So why is it c.size() in the worst case? Note that erase has a return value:

Return value
1-2) Iterator following the last removed element.
[...]

The function has to find the "next element". std::unordered_set stores its data in so-called bucket lists. Ideally, the next element sits in the same bucket list as the element you erase. In the worst case, it sits in some other bucket far away, so the search scales with the size of the container. Which case you hit depends on the insert/erase history of the container. You can have a look at the libcxx implementation here; there is a loop traversing the nodes in the bucket list (the mechanism is well explained by @eerorika's answer).
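To illustrate why the iterator returned by erase is useful, here is a minimal sketch of the usual erase-while-iterating loop. The function name remove_even is hypothetical and chosen only for this example; it assumes a std::unordered_set<int> like the one in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical helper: removes every even value and returns how many
// elements were removed.
std::size_t remove_even(std::unordered_set<int>& s) {
    const std::size_t before = s.size();
    for (auto it = s.begin(); it != s.end(); ) {
        if (*it % 2 == 0) {
            // erase invalidates `it`; continue from the iterator it returns,
            // which points to the element following the removed one.
            it = s.erase(it);
        } else {
            ++it;
        }
    }
    assert(s.size() <= before);
    return before - s.size();
}
```

Calling remove_even on a set holding 1,2,3,4,6,7,5 would remove 2, 4 and 6 and leave the four odd values.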


Besides, note that (also from the docs on erase):

References and iterators to the erased elements are invalidated

So dereferencing the iterator it after you have erased it from the set is undefined behavior. You can fix this by copying the value before erasing:

auto it = set_of_ints.begin();
const int value = *it;

set_of_ints.erase(it);

std::cout << "removed element is: " << value << "\n";
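The copy-then-erase pattern above could be wrapped in a small helper that also covers the empty set, where dereferencing begin() would itself be undefined. This is only a sketch: pop_any is a hypothetical name, not a standard library function, and it assumes C++17 for std::optional.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_set>

// Hypothetical helper: removes an arbitrary element from the set and
// returns it, or returns std::nullopt if the set is empty.
template <typename T, typename Hash, typename Eq, typename Alloc>
std::optional<T> pop_any(std::unordered_set<T, Hash, Eq, Alloc>& s) {
    if (s.empty()) {
        return std::nullopt;        // nothing to remove
    }
    const std::size_t old_size = s.size();
    auto it = s.begin();
    T value = *it;                  // copy the value BEFORE erasing
    s.erase(it);                    // `it` is invalid from here on
    assert(s.size() == old_size - 1);
    return value;
}
```

With the set from the question, auto v = pop_any(set_of_ints); would leave six elements in the set and the removed one in *v.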
lubgr answered Nov 14 '22 14:11


No, there is no faster way to remove an element of a set than erase, unless your intention is to transfer the element into another set, in which case extract may be faster overall.
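A minimal sketch of the extract route, assuming C++17 and two sets with the same element, hasher and allocator types. The name transfer_any is hypothetical. The node is unlinked from one set and relinked into the other, so the element itself is neither copied nor reallocated.

```cpp
#include <cassert>
#include <unordered_set>
#include <utility>

// Hypothetical helper: moves one arbitrary element from `src` to `dst`
// and returns its value. Precondition: `src` is not empty.
int transfer_any(std::unordered_set<int>& src, std::unordered_set<int>& dst) {
    assert(!src.empty());
    auto node = src.extract(src.begin());  // node handle now owns the element
    const int value = node.value();        // read it while the handle owns it
    dst.insert(std::move(node));           // relink the same node into dst
    return value;
}
```

If dst already contains an equal element, the insertion does not happen and the node is destroyed along with the returned result, so the element is still removed from src.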

The choice of element is irrelevant, except that if you don't have an iterator at hand, the fastest iterator to get is begin.


In case you're wondering when erasure might have linear complexity: if the buckets are implemented as singly linked lists (as is typical), all elements have the same key (or the keys happen to have the same hash value), and the erased element happens to be the last in the bucket, then the entire container would need to be traversed.

The constant average assumes an even distribution of keys and a good hash function.
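The degenerate case can be reproduced on purpose with a hash function that maps every key to the same value. The sketch below is for illustration only: ConstantHash, largest_bucket and make_colliding_set are hypothetical names, and the code shows the bucket layout rather than measuring erase time.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Deliberately bad hash (illustration only): every key collides.
struct ConstantHash {
    std::size_t operator()(int) const noexcept { return 0; }
};

// Returns the number of elements in the fullest bucket of the set.
template <typename Set>
std::size_t largest_bucket(const Set& s) {
    std::size_t largest = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b) {
        if (s.bucket_size(b) > largest) {
            largest = s.bucket_size(b);
        }
    }
    return largest;
}

// Builds a set of 0..n-1 in which all elements share a single bucket.
std::unordered_set<int, ConstantHash> make_colliding_set(int n) {
    std::unordered_set<int, ConstantHash> s;
    for (int i = 0; i < n; ++i) {
        s.insert(i);
    }
    // Every hash is 0, so one bucket holds the whole container.
    assert(largest_bucket(s) == s.size());
    return s;
}
```

With such a set, any operation that has to walk a bucket may end up walking every element, which is where the linear worst case comes from.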


However, erasure invalidates the iterator, so the behaviour of indirecting through it after erasure is undefined.

eerorika answered Nov 14 '22 13:11