Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::map for small sparse collections

Given a struct MyData, of which many instances exist (I'd say several millions at most), for each instance I need to store a member which may contain values for up to 8 keys. The key will always be an int ranged 0-7, and the values will always be a 3D point of floats (let's call it Point3).

At maximum, it would contain:

Key | Value  
-------------
0   | [x,y,z]  
1   | [x,y,z]  
2   | [x,y,z]
3   | [x,y,z]
4   | [x,y,z]
5   | [x,y,z]
6   | [x,y,z]
7   | [x,y,z]

However, in 99.9% of cases it will contain 0 or 1 key-value pairs, e.g.:

Key | Value
-------------
1   | [x,y,z]  

How can I efficiently determine the memory overhead, if any, of storing an empty or single-valued std::map<int, Point3>, compared to always storing an array of 8 Point3 (4 bytes per float * 3 values * 8 slots = 96 bytes) and a single BYTE with bits for which slots contain meaningful values?

In generality, my question is what is the memory overhead of an empty or almost empty std::map?

like image 708
Rotem Avatar asked May 27 '15 08:05

Rotem


2 Answers

The memory overhead of a map isn't that bad. It's typically a few words per node. Using a map to start with would definitely be OK under the "no premature optimization" rule.

That said, when you do optimize, the map will be high on the list of data structures to replace. But at that point, you can profile all the different operations you actually use. How often do the keys and/or values change? That's crucial information to know before optimizing.

[edit] If I were to suggest a structure, it would be a vector of std::pair<int, Point3D>. The reason is that this probably gives alignment-friendly 16 byte objects. I wouldn't bother with sorting the keys, because that's only useful for the 0.1% nodes that do have multiple key/value pairs.

like image 167
MSalters Avatar answered Sep 19 '22 16:09

MSalters


Have a look at this blog post. There's a very thorough analysis of memory usage of different STL containers (including std::map) and different platforms/compilers are taken into account.

In the 64 bit STL that comes with Visual Studio 2010 (Visual C++ 10):


map uses 32 bytes for the map object itself, then each map node is 26 bytes bigger than the size of the contained object (likely 32 bytes after taking alignment and wastage into account). The number of map nodes required for a map is 1 more than the size of the map.

like image 31
banach-space Avatar answered Sep 20 '22 16:09

banach-space