Can you share your thoughts on what the best STL data structure would be for storing a large list of names and perform searches on these names?
Edit: The names are not unique and the list can grow as new names can continuously added to it. And by large I am talking of from 1 million to 10 million names.
Since you want to search names, you want a structure that support fast random access. That means vector, deque and list are all out of the question. Also, vector/array are slow on random adds/inserts for sorted sets because they have to shift items to make room for each inserted item. Adding to end is very fast, though.
Consider std::map
, std::unordered_map
or std::unordered_multimap
(or their siblings std::set
, std::unordered_set
and std::unordered_multiset
if you are only storing keys).
If you are purely going to do unique, random access, I'd start with one of the unordered_* containers.
If you need to store an ordered list of names, and need to do range searches/iteration and sorted operations, a tree based container like std::map
or std::set
should do better with the iteration operation than a hash based container because the former will store items adjacent to their logical predecessors and successors. For random access, it is O(log N) which is still decent.
Prior to std::unordered_*, I used std::map
to hold large numbers of objects for an object cache and though there are faster random access containers, it scaled well enough for our uses. The newer unordered_map has O(1) access time so it is a hashed structure and should give you the near best access times.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With