Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using string_view to search unordered_map<string, int> [duplicate]

I have an unordered_map that uses a string-type as a key:

std::unordered_map<string, value> map;

A std::hash specialization is provided for string, as well as a suitable operator==.

Now I also have a "string view" class, which is a weak pointer into an existing string, avoiding heap allocations:

class string_view {
    string *data;
    size_t begin, len;
    // ...
};  

Now I'd like to be able to check if a key exists in the map using a string_view object. Unfortunately, std::unordered_map::find takes a Key argument, not a generic T argument.

(Sure, I can "promote" one to a string, but that causes an allocation I'd like to avoid.)

What I would've liked instead was something like

template<class Key, class Value>
class unordered_map
{
    template<class T> iterator find(const T &t);
};

which would require operator==(T, Key) and std::hash<T>() to be suitably defined, and would return an iterator to a matching value.

Is there any workaround?

like image 222
peppe Avatar asked Jan 04 '16 17:01

peppe


People also ask

Does unordered_map allow duplicates?

We have discussed unordered_map in our previous post, but there is a limitation, we can not store duplicates in unordered_map, that is if we have a key-value pair already in our unordered_multimap and another pair is inserted, then both will be there whereas in case of unordered_map the previous value corresponding to ...

What is heterogeneous lookup?

To avoid this unnecessary work, some containers provide heterogeneous lookup. This feature allows callers to pass keys of any type (as long as the user-specified comparator functor supports them). See std::map::find for an example of this feature in an STL container.

Can we compare two unordered maps?

The comparison between unordered_map objects is not affected by the arbitrary order in which they store their elements. Two unordered_maps are equal if they have the same number of elements and the elements in one container are a permutation of the elements in the other container. Otherwise, they are unequal.

What does count function do in unordered_map?

The unordered_map::count() is a builtin method in C++ which is used to count the number of elements present in an unordered_map with a given key.


4 Answers

P0919R2 Heterogeneous lookup for unordered containers has been merged in the C++2a's working draft!

The abstract seems a perfect match w.r.t. my original question :-)

Abstract

This proposal adds heterogeneous lookup support to the unordered associative containers in the C++ Standard Library. As a result, a creation of a temporary key object is not needed when different (but compatible) type is provided as a key to the member function. This also makes unordered and regular associative container interfaces and functionality more compatible with each other.

With the changes proposed by this paper the following code will work without any additional performance hits:

template<typename Key, typename Value>
using h_str_umap = std::unordered_map<Key, Value, string_hash>;
h_str_umap<std::string, int> map = /* ... */;
map.find("This does not create a temporary std::string object :-)"sv);
like image 172
peppe Avatar answered Oct 14 '22 18:10

peppe


As mentioned above, C++14 does not provide heterogeneous lookup for std::unordered_map (unlike std::map). You can use Boost.MultiIndex to define a fairly close substitute for std::unordered_map that allows you to look up string_views without allocating temporary std::strings:

Live Coliru Demo

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <string>

using namespace boost::multi_index;

struct string_view
{
  std::string *data;
  std::size_t begin,len;
};

template<typename T,typename Q>
struct mutable_pair
{
  T         first;
  mutable Q second;
};

struct string_view_hash
{
  std::size_t operator()(const string_view& v)const
  {
     return boost::hash_range(
       v.data->begin()+v.begin,v.data->begin()+v.begin+v.len);
  }
  std::size_t operator()(const std::string& s)const
  {
     return boost::hash_range(s.begin(),s.end());
  }
};

struct string_view_equal_to
{
  std::size_t operator()(const std::string& s1,const std::string& s2)const
  {
     return s1==s2;
  }
  std::size_t operator()(const std::string& s1,const string_view& v2)const
  {
     return s1.size()==v2.len&&
            std::equal(
              s1.begin(),s1.end(),
              v2.data->begin()+v2.begin);
  }
  std::size_t operator()(const string_view& v1,const std::string& s2)const
  {
     return v1.len==s2.size()&&
            std::equal(
              v1.data->begin()+v1.begin,v1.data->begin()+v1.begin+v1.len,
              s2.begin());
  }
};

template<typename Q>
using unordered_string_map=multi_index_container<
  mutable_pair<std::string,Q>,
  indexed_by<
    hashed_unique<
      member<
        mutable_pair<std::string,Q>,
        std::string,
        &mutable_pair<std::string,Q>::first
      >,
      string_view_hash,
      string_view_equal_to
    >
  >
>;

#include <iostream>

int main()
{
  unordered_string_map<int> m={{"hello",0},{"boost",1},{"bye",2}};

  std::string str="helloboost";
  auto it=m.find(string_view{&str,5,5});
  std::cout<<it->first<<","<<it->second<<"\n";
}

Output

boost,1
like image 42
Joaquín M López Muñoz Avatar answered Oct 14 '22 18:10

Joaquín M López Muñoz


It looks like only as recently as C++14 did even the basic map get such a templated find for is_transparent types in the comparison. Most likely the correct implementation for hashed containers was not immediately evident.

As far as I can see your two options are:

  • Just do the allocation and profile to see if maybe it's not actually a problem.
  • Take a look at boost::multi_index (http://www.boost.org/doc/libs/1_60_0/libs/multi_index/doc/index.html) and have both string and string_view indexes into the container.
like image 42
Mark B Avatar answered Oct 14 '22 16:10

Mark B


I faced an equal problem.

We need two structs:

struct string_equal {
    using is_transparent = std::true_type ;

    bool operator()(std::string_view l, std::string_view r) const noexcept
    {
        return l == r;
    }
};


struct string_hash {
    using is_transparent = std::true_type ;

    auto operator()(std::string_view str) const noexcept {
        return std::hash<std::string_view>()(str);
    }
};

For unordered_map:

template <typename Value>
using string_unorderd_map = std::unordered_map<std::string, Value, string_hash, string_equal>;

For unordered_set:

using string_unorderd_set = std::unordered_set<std::string, string_hash, string_equal>;

Now using string_view is possible.

like image 21
Nikita Avdosev Avatar answered Oct 14 '22 16:10

Nikita Avdosev