Why would anyone use set instead of unordered_set?

People also ask

What is the main benefit of using set over unordered_set?

set lets you traverse its elements in sorted order, whereas unordered_set does not.

What is the difference between set and unordered_set?

set is an ordered collection of unique keys, whereas unordered_set stores unique keys in no particular order. set is typically implemented as a balanced binary search tree, which is why it can maintain order between the elements (an in-order traversal of the tree visits them sorted).

Is set or unordered set faster?

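unordered_set has average constant-time search, insertion, and removal, so it is usually faster for large collections. However, set uses less memory than unordered_set to store the same number of elements, and for a small number of elements, lookups in a set might be faster than lookups in an unordered_set.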

What does unordered_set mean?

Unordered set is an associative container that contains a set of unique objects of type Key. Search, insertion, and removal have average constant-time complexity. Internally, the elements are not sorted in any particular order, but organized into buckets.

Unordered sets have to pay for their O(1) average access time in a few ways:

set uses less memory than unordered_set to store the same number of elements.
For a small number of elements, lookups in a set might be faster than lookups in an unordered_set.
Even though many operations are faster in the average case for unordered_set, set often guarantees better worst-case complexity for them (insert, for example).
set keeps its elements sorted, which is useful if you want to access them in order.
You can lexicographically compare different sets with <, <=, > and >=. unordered_sets are not required to support these operations.
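
The last two points can be sketched in a few lines. This is only an illustration; the helper names in_order and comes_before are made up for this example and are not part of the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Copies the elements of a std::set in iteration order.
// For std::set that order is always sorted (hypothetical helper).
std::vector<int> in_order(const std::set<int>& s) {
    std::vector<int> out(s.begin(), s.end());
    assert(std::is_sorted(out.begin(), out.end()));
    return out;
}

// std::set defines <, <=, > and >=; they compare element by element,
// lexicographically (hypothetical helper).
bool comes_before(const std::set<int>& a, const std::set<int>& b) {
    return a < b;
}
```

For example, in_order({1, 8, 2, 5, 3, 9}) yields 1, 2, 3, 5, 8, 9, and comes_before({1, 2, 3}, {1, 2, 4}) is true. The same call with unordered_set arguments would not compile, since unordered_set only provides == and !=.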

When you want to iterate over the items of the set and the order matters.

Whenever you prefer a tree to a hash table.

For instance, hash table operations are O(n) in the worst case; O(1) is only the average case. Tree operations are O(log n) in the worst case.
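
To see where the hash table worst case comes from, here is a deliberately bad setup. ConstantHash and largest_bucket_after_inserts are names invented for this sketch; real hash functions are rarely this bad, but adversarial or unlucky keys can push things in the same direction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Deliberately bad hash, for illustration only: every key collides.
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};

// Inserts 0..n-1 and returns the size of the fullest bucket.
std::size_t largest_bucket_after_inserts(int n) {
    std::unordered_set<int, ConstantHash> s;
    for (int i = 0; i < n; ++i) s.insert(i);
    assert(s.size() == static_cast<std::size_t>(n));

    std::size_t largest = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b)
        largest = std::max(largest, s.bucket_size(b));
    return largest;
}
```

Keys with equal hash values always share a bucket, so here every element lands in one bucket and each lookup or insert has to walk through all of them. A set never degrades like this because the tree is kept balanced.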

Use set when:

We need ordered data (distinct elements).
We need to print or access the data in sorted order.
We need the predecessor or successor of an element.
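
A minimal sketch of the predecessor/successor point, using lower_bound and upper_bound. The function names predecessor and successor are chosen for this example; both return s.end() when there is no such element.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Largest element strictly less than x, or s.end() if there is none.
std::set<int>::const_iterator predecessor(const std::set<int>& s, int x) {
    auto it = s.lower_bound(x);            // first element >= x
    if (it == s.begin()) return s.end();   // nothing smaller than x
    auto pred = std::prev(it);
    assert(*pred < x);
    return pred;
}

// Smallest element strictly greater than x, or s.end() if there is none.
std::set<int>::const_iterator successor(const std::set<int>& s, int x) {
    return s.upper_bound(x);               // first element > x
}
```

Both calls are O(log n). unordered_set has no equivalent; you would have to scan every element.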

Use unordered_set when:

We need to keep a set of distinct elements and no ordering is required.
We need single-element access only, i.e. no traversal.
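
A typical case where unordered_set fits well is a pure membership test. The sketch below (has_duplicate is a name made up for this example) only ever asks "have I seen this key before?" and never iterates over the container.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Returns true if some value occurs more than once in `values`.
bool has_duplicate(const std::vector<int>& values) {
    std::unordered_set<int> seen;
    for (int v : values) {
        // insert() reports through .second whether the key was new.
        if (!seen.insert(v).second) return true;
    }
    assert(seen.size() == values.size());  // every value was distinct
    return false;
}
```

Swapping in std::set would give the same answers; the choice only affects speed and memory use.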

Examples:

set:

Input : 1, 8, 2, 5, 3, 9

Output : 1, 2, 3, 5, 8, 9

unordered_set:

Input : 1, 8, 2, 5, 3, 9

Output : 9, 3, 1, 8, 2, 5 (one possible order; it depends on the hash function)

Note: in some cases set is more convenient, for example when using a vector as the key:

set<vector<int>> s;
s.insert({1, 2});
s.insert({1, 3});
s.insert({1, 2});   // duplicate, ignored

for (const auto& vec : s)
    cout << vec << endl;   // assumes operator<< has been overloaded for vector<int>
// 1 2
// 1 3

vector<int> can be used as a key in set because vector provides operator<.

But if you use unordered_set<vector<int>>, you have to supply a hash function for vector<int>, because the standard library doesn't provide one. You have to define one like:

struct VectorHash {
    size_t operator()(const std::vector<int>& v) const {
        std::hash<int> hasher;
        size_t seed = 0;
        for (int i : v) {
            seed ^= hasher(i) + 0x9e3779b9 + (seed<<6) + (seed>>2);
        }
        return seed;
    }
};

void two() {
    // unordered_set<vector<int>> s;  // error: no std::hash specialization for vector<int>
    unordered_set<vector<int>, VectorHash> s;
    s.insert({1, 2});
    s.insert({1, 3});
    s.insert({1, 2});   // duplicate, ignored

    for (const auto& vec : s)
        cout << vec << endl;
    // prints "1 2" and "1 3", in unspecified order
}

As you can see, in some cases unordered_set takes more work to use.

Mainly cited from: https://www.geeksforgeeks.org/set-vs-unordered_set-c-stl/ and https://stackoverflow.com/a/29855973/6329006

g++ 6.4 libstdc++ ordered vs unordered set benchmark

I benchmarked this dominant Linux C++ implementation to see the difference:

[Benchmark plot not reproduced here]

The full benchmark details and analysis are given in "What is the underlying data structure of a STL set in C++?" and I will not repeat them here.

"BST" means "tested with std::set" and "hash map" means "tested with std::unordered_set". "Heap" is for std::priority_queue, which I analyzed in "Heap vs Binary Search Tree (BST)".

As a quick summary:

The graph clearly shows that under these conditions, hash map insertion was always a lot faster when there were more than 100k items, and the difference grows as the number of items increases.

The cost of this speed boost is that you are not able to efficiently traverse in order.
The curves clearly suggest that ordered std::set is BST-based and std::unordered_set is hash map based. In the reference answer, I further confirmed that by GDB step debugging the code.
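
If you want to try a comparison like this on your own machine, a rough timing harness could look like the following. This is not the benchmark used for the graph above; time_inserts and make_keys are names made up for this sketch, and the numbers it produces depend heavily on compiler, flags, and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <set>
#include <unordered_set>
#include <vector>

// Inserts every key into a fresh Container and returns the elapsed seconds.
// The resulting container size is written to final_size.
template <typename Container>
double time_inserts(const std::vector<int>& keys, std::size_t& final_size) {
    auto start = std::chrono::steady_clock::now();
    Container c;
    for (int k : keys) c.insert(k);
    auto stop = std::chrono::steady_clock::now();
    final_size = c.size();
    assert(final_size <= keys.size());
    return std::chrono::duration<double>(stop - start).count();
}

// Pseudo-random non-negative keys with a fixed seed so runs are repeatable.
std::vector<int> make_keys(std::size_t n) {
    std::mt19937 rng(42);
    std::vector<int> keys(n);
    for (int& k : keys) k = static_cast<int>(rng() >> 1);
    return keys;
}
```

Call time_inserts<std::set<int>>(keys, n) and time_inserts<std::unordered_set<int>>(keys, n) with the same keys for several sizes, and compile with optimizations on before reading anything into the results.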

Similar question for map vs unordered_map: "Is there any advantage of using map over unordered_map in case of trivial keys?"

Because, before C++11, std::set was part of Standard C++ and unordered_set wasn't. C++0x was a draft, not a standard, and Boost is not a standard either. For many of us, portability is essential, and on a pre-C++11 toolchain that means sticking to what the standard provides.

Consider sweep line algorithms. These algorithms would fail utterly with hash tables, but work beautifully with balanced trees. For a concrete example of a sweep line algorithm, consider Fortune's algorithm: http://en.wikipedia.org/wiki/Fortune%27s_algorithm
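
Fortune's algorithm is too long to show here, so the sketch below uses a much smaller sweep line problem instead: the closest pair of points in the plane. The function name closest_pair_distance and the type aliases are invented for this example. What matters is the lower_bound range query on the set of active points, which only works because the set is ordered.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <set>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // (x, y)
using YX = std::pair<double, double>;     // (y, x), so the set is ordered by y

// Smallest distance between any two of the given points (needs at least two).
double closest_pair_distance(std::vector<Point> pts) {
    assert(pts.size() >= 2);
    const double inf = std::numeric_limits<double>::infinity();
    std::sort(pts.begin(), pts.end());    // the sweep line moves in increasing x
    std::set<YX> active;                  // points close behind the sweep line
    double best = inf;
    std::size_t left = 0;                 // oldest point still in `active`

    for (std::size_t i = 0; i < pts.size(); ++i) {
        const double x = pts[i].first, y = pts[i].second;
        // Drop points too far behind the sweep line to improve the answer.
        while (left < i && x - pts[left].first > best) {
            active.erase(YX(pts[left].second, pts[left].first));
            ++left;
        }
        // Only points with y in [y - best, y + best] are candidates.
        // The ordered set finds the start of that range in O(log n).
        auto it = active.lower_bound(YX(y - best, -inf));
        for (; it != active.end() && it->first <= y + best; ++it)
            best = std::min(best, std::hypot(x - it->second, y - it->first));
        active.insert(YX(y, x));
    }
    return best;
}
```

With an unordered_set there is no way to ask for "all active points whose y lies in this range" short of scanning every element, which is exactly the step the sweep depends on.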
