I have been given a project to investigate alternatives to the container currently used to store data, in order to make it more efficient.
The current design involves four nested maps, like this:
map< string, map< string, map< int, map< string, string > > > >
Let's name the data fields Company, Department, ID_of_employee and Name.
The time complexity at the moment to retrieve the name of an employee given Company, Dept and ID is O(log N); more precisely, it involves three lookups.
Space complexity for now is not an issue.
My initial alternatives were the following:

1. Nest pairs of Company, Dept and ID and use the nested pairs as the key to a map. This does not seem easily readable, though.
2. A tuple or a struct, which, as far as I have read, are essentially not that different. After creating a new struct EmployeeKey that contains fields for Company, Dept and ID, I can use it as the key to a map (I guess I also have to write custom compare / less-than operators).
3. Build a concatenated key Company+Dept+ID by converting the int to a string and concatenating the fields, then feed this key to a map<ConcatenatedKey, Data>.
To give a little more necessary context: this container is usually used to retrieve the final nested data, which is why I settled on the concatenated-key approach. My question basically is: are there any caveats to using this sort of concatenated string? Is it bad design, or something we should avoid?
To my understanding this will improve the lookup time: it stays logarithmic, but performs one lookup rather than three, so it seems like an improvement.
Since std::map<> is a red-black tree, which is still a binary tree, lookup is not that fast compared to hash maps, especially if the number of entries is large.
Using an std::unordered_map<> (a hash map) will give better performance, assuming the hash spread is good. I recommend FNV or MurmurHash3, since they have the nicest distribution of values.
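As a sketch, 64-bit FNV-1a is only a few lines; the two constants are the published FNV offset basis and FNV prime:

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a over the bytes of a string.
std::uint64_t fnv1a64(const std::string& s)
{
    std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;                  // xor in one byte
        h *= 1099511628211ULL;   // multiply by the FNV prime
    }
    return h;
}
```

For struct keys you would feed each member through such a function and combine the results.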
Now, talking about nested containers: you should never, ever do such a thing! The overall performance can be quite horrible and the memory usage will definitely be very large, since it is essentially a four-dimensional RB-tree.
Let's put this into context: say you have 20 companies, each company has 5 departments, each department has 12 employee IDs, and each employee ID maps to a map of <Name, some_string> (that last bit seems a bit redundant, don't you think?). That is 1 outer map, 20 company maps, 100 department maps and 1200 innermost maps: 1321 container instances holding 1200 entries.
So you see, nesting containers is very dangerous: even with a small data set you can end up with a huge number of container instances. This gives ridiculously bad performance, especially when the objects are being destroyed or new elements are inserted.
What I recommend: Use an EmployeeKey struct combined with std::unordered_map. This will give you a good lookup speed and only one std::unordered_map instance.
struct EmployeeKey
{
    int         CompanyID;   // if you want speed, an integer CompanyID is the way to go
    std::string Department;
    int         EmployeeID;
    std::string Name;

    inline bool operator==(const EmployeeKey& key) const {
        return CompanyID  == key.CompanyID
            && Department == key.Department
            && EmployeeID == key.EmployeeID
            && Name       == key.Name;
    }
};
namespace std {
    template<> struct hash<EmployeeKey> {
        size_t operator()(const EmployeeKey& key) const {
            // boost-style hash_combine of all key members
            size_t h = hash<int>()(key.CompanyID);
            h ^= hash<string>()(key.Department) + 0x9e3779b9 + (h << 6) + (h >> 2);
            h ^= hash<int>()(key.EmployeeID) + 0x9e3779b9 + (h << 6) + (h >> 2);
            h ^= hash<string>()(key.Name) + 0x9e3779b9 + (h << 6) + (h >> 2);
            return h;
        }
    };
}
And this should be enough to get you started. The final layout would look like this:
std::unordered_map<EmployeeKey, std::string> EmployeeData;
// usage:
auto it = EmployeeData.find(selectedEmployee);
if (it != EmployeeData.end())
it->second = "Good employee";
If you really have to speed up your lookups by every possible means, keep in mind that if the CompanyIDs are integers in [0 .. N], you can use an std::vector to get a fast first-level index to the right company:
std::vector<std::unordered_map<EmployeeKey2, std::string>> EmployeeData;
// usage:
auto& companyMap = EmployeeData[EBuyNLargeCorp]; // or [selectedCompany(0..N)]
auto it = companyMap.find(selectedEmployee);
if (it != companyMap.end())
it->second = "Good employee!";
Where EmployeeKey2 would be EmployeeKey without the CompanyID field, and selectedCompany would be the index into the vector instead. But this is only something you do for really crucial performance gains.