I need a data structure just like a map, but where each key may have multiple values associated with it, and I need to get all the values corresponding to a single key as an array of objects. Which data structure would be best for this? I don't need to search within the data structure; I just need fast access to all values corresponding to a particular key. I have looked at std::multimap, but it doesn't return all the values for a particular key as a single array. So what might be the best data structure in C++ to use?
I need a data structure just like a map but...
std::map<key, std::vector<value>>
80 million points is a good whack - worth considering other options. Ones worth a little thought/experimentation/benchmarking include:

- sparse direct indexing: to achieve that, you need enough memory not just for the 80 million data points, but for the entire x/y/z space that they span; you can then do an [x][y][z] lookup to find the vector of cell ids. That's obviously going to be huge - whether it's doable or desirable is not clear from your problem description.

- a sorted vector: depending on the order/overlap of your data structure element insertions and lookups, and whether you can afford a std::map-to-std::vector compaction step, you could sort a std::vector of (x,y,z) values and then have binary_search outperform std::map due to the contiguous memory usage of the vector.

- std::unordered_map<key, std::vector<value>>: presizing for, say, 100 million buckets should speed insertion a bit. This could be slower or faster than the other options: there are likely fewer pages of memory for the index than for sparse indexing, but more than for a binary_search on contiguous memory. A hash table needs the fewest memory pages visited per lookup, but with normal hash techniques you'll be hitting effectively random (though repeatable) hash buckets even if the x,y,z coordinates differ only a little, so cache hits could be worse than with all the other options above.
Actual benchmarks are always the best way to tune, preferably with a profile to confirm costs are for expected reasons.
The answer by @TonyD is certainly fine, but there are some tradeoffs compared to std::multimap<key, value>.

Searching for all values for a given key should give you the same O(log N) complexity:
auto result = my_multimap.equal_range(my_key);
Iteration is still of O(N) complexity:
for (auto it = result.first; it != result.second; ++it)
// bla
However, in all real-world std::multimap implementations, the above iteration does node-based pointer chasing over the "consecutive" value elements, rather than the contiguous iteration that you get with the std::vector-based std::map. This might matter for reasons of cache locality.
The main drawback that I can see of the std::vector solution is that you are committing to keeping all values together, which might impose some overhead, depending on how often you are copying your data around.
The multimap approach also makes it easier to insert/extract a single value from the container:

my_multimap.insert(std::make_pair(some_key, another_value));
versus
auto it = my_map.find(some_key);
if (it != my_map.end())
    it->second.push_back(another_value);
else
    my_map.insert(std::make_pair(some_key, std::vector<value>{ another_value }));
You probably should benchmark your program to see which container is more convenient.