Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::map get value - find vs handcrafted loop [closed]

I have a std::map object map<string , Property*> _propertyMap, where string is the property's name and Property* contains the property values.

I need to process the properties values and convert them to a specific data format- each property has its own format, e.g.. if the map initialization is as following:

_propertyMap["id"]   = new Property(Property::UUID, "12345678");
_propertyMap["name"] = new Property(Property::STRING, "name");
....

then "id" should be processed differently than "name" etc.

This means that I need to look for each property in the map and process its values accordingly.

I thought about two ways to do that.

One, use std::map::find method to get a specific property, like that:

map<string , Property*>::iterator it1 = _propertyMap.find("id");
if(it1 != _propertyMap.end())
{
   //element found - process id values
} 
map<string , Property*>::iterator it2 = _propertyMap.find("name");
if(it2 != _propertyMap.end())
{
   //element found - process name values
} 
....

Two, iterate the map and for each entry check what the property's name is and proceed accordingly:

for (it = _propertyMap.begin(); it != _propertyMap.end(); ++it )
    {
        //if it is events - append the values to the matching nodes
        if (it->first == "id")
        {
           //process id values
        }
        else if (it->first == "name")
                {
                    //process name values
                }
        .....
    }

Given that the Time complexity of std::map::find is O(logN), the complexity of the first solution is O(NlogN). I'm not sure about the complexity of the second solution, because it iterates the map once (O(N)), but performs a lot of if-else each iteration. I tried to google common map::find() questions, but couldn't find any useful information; most of them just need to get one value from the map, and then find() does this with better complexity (O(logN) vs O(N)).

What is a better approach? or perhaps there is another one which I didn't think of?

Also, code styling speaking, which one is more good and clear code?

like image 893
user3114639 Avatar asked Mar 20 '16 16:03

user3114639


3 Answers

I see a few different use-cases here, depending on what you have in mind:

Fixed properties

(Just for completeness, i guess it is not what you want) If both name and type of possible properties should be fixed, the best version is to use a simple class/struct, possibly using boost::optional (std::optional with C++17) for values that might be present or not

struct Data{
    int id = 0;
    std::string name = "";
    boost::optional<int> whatever = boost::none;
}

Pros:

  • All "lookups" are resolved at compile-time

Cons:

  • No flexibility to expand at runtime

Process only specific options depending on their key

If you want to process only a specific subset of options, but keep the option to have (unprocessed) custom keys your approaches seem suitable.

In this case remember that using find like this:

it1 = _propertyMap.find("id");

has complexity O(logN) but is used M times, with M beeing the number of processed options. This is not the size of your map, it is the number of times you use find() to get a specific property. In your (shortened) example this means a complexity of O(2 * logN), since you only look for 2 keys.

So basically using M-times find() scales better than looping when only the size of the map increases, but worse if you increase the number of finds in the same manner. But only profiling can tell you which one is the faster for your size and use case.

Process all options depending on type

Since your map looks a lot like the keys can be custom but the types are from a small subset, consider looping over the map and using the types instead of the names to determine how to process them. Something like this:

for (it = _propertyMap.begin(); it != _propertyMap.end(); ++it )
{
    if (it->first.type() == Property::UUID)
    {
       //process UUID values
    }
    else if (it->first.type() == Property::STRING)
    {
        //process STRING values
    }
    .....
}

This has the advantage, that you do not need any information about what the keys of your map really are, only what types it is able to store.

like image 173
Anedar Avatar answered Nov 10 '22 10:11

Anedar


Suppose we have a map of N properties, and we are looking for a subset of P properties. Here a rough analysis, not knowing the statistical distribution of the keys:

  • In the pure map approach you search P times with a complexity of O(log(n)), that is O(p*log(n))

  • In the chained-if approach you are going to traverse once the map. That's O(N). But you should not forget that an if-then chain is also a (hiden) traversal of list of P elements. So for every of the N elements you are doing a search of potentially up to P elements. So that you have here a complexity of O(p*n).

This means that the map approach will outperform your traversal, and the performance gap will increase significantly with n. Of course this doesn't take into account function call overhead in map that you don't have in the if-chain. So that if P and N are small, your approach could still stand the theoretical comparison.

What you could eventually do to increase peformance further would be to use an unordered_map, which is O(1) in complexity, reducing your problem complexity to O(P).

like image 1
Christophe Avatar answered Nov 10 '22 12:11

Christophe


There is another option which combines the best of both. Given a function like this (which is an adaptation of std::set_intersection):

template<class InputIt1, class InputIt2, 
         class Function, class Compare>
void match(InputIt1 first1, InputIt1 last1,
           InputIt2 first2, InputIt2 last2,
           Function f, Compare comp)
{   
    while (first1 != last1 && first2 != last2) {
        if (comp(*first1,*first2)) {
            ++first1;
        } else {
            if (!comp(*first2,*first1)) {
                f(*first1++,*first2);
            }
            ++first2;
        }
    }
}

You can use it to process all your properties in O(N+M) time. Here is an example:

#include <map>
#include <string>
#include <functional>
#include <cassert>

using std::map;
using std::string;
using std::function;

struct Property {
  enum Type { UUID, STRING };
  Type type;
  string value;
};

int main()
{
    map<string,Property> properties;
    map<string,function<void(Property&)>> processors;

    properties["id"] = Property{Property::UUID,"12345678"};
    properties["name"] = Property{Property::STRING,"name"};

    bool id_found = false;
    bool name_found = false;

    processors["id"]   = [&](Property&){ id_found = true; };
    processors["name"] = [&](Property&){ name_found = true; };

    match(
       properties.begin(),properties.end(),
       processors.begin(),processors.end(),
       [](auto &a,auto &b){ b.second(a.second); },
       [](auto &a,auto &b) { return a.first < b.first; }
    );

    assert(id_found && name_found);
}

The processors map can be built separately and reused to reduce the overhead.

like image 1
Vaughn Cato Avatar answered Nov 10 '22 11:11

Vaughn Cato