Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Looking for a data structure that's fast to initialize and fast to lookup (O(1))

I need a data structure in which I want to store information about which instances I already processed during an action. Because of limitations I can't store it in the instance itself (e.g. because I can execute the action in parallel.

What's specific is that the instances for which I want to store information have a unique number, so instead of a pointer to the instance, I could use that unique number to store information.

My first solution was to use an std::set<Instance *>. Every time i process an instance, I add it to the set so that I know that I already processed that instance.

  • Advantage: this is very fast to initialize
  • Disadvantage: lookups are not O(1), but O(logN)

My second soluction was to use an std::vector<bool> (actually std::vector<byte> because bool vectors have a specific specialization which makes it slower than a non-bool vector). The unique number of the instance can be used as index into the vector, and in the vector simply contains true or false to indicate if we already processed the instance or not (luckily my unique numbers start to count from 1).

  • Advantage: lookups are O(1)
  • Disadvantage: initialization if relatively slow, since std::vector needs to initialize every element explicitly (and probably also independently)

I could also use a C-style array (on which I can use memset), but since the number of instances (or the number of unique numbers) is not known beforehand, I need to write my own code to extend the array, memset the rest of the array, ... (which is not very hard, but which is something I want to avoid).

Is there any other kind of data structure that is very fast to initialize, and has O(1) lookup time?

like image 269
Patrick Avatar asked Dec 21 '25 04:12

Patrick


1 Answers

You may try boost::unordered_set or the new C++11 std::unordered_set. They are hashed based containers rather than trees like std::set.