Avoid temporaries in std::map/std::unordered_map lookup with std::string key [duplicate]

Tags: c++, stl

Consider the following code:

std::map<std::string, int> m1;
auto i = m1.find("foo");

const char* key = ...
auto j = m1.find(key);

This will create a temporary std::string object for every map lookup. What are the canonical ways to avoid it?

Alex B asked Jan 24 '12 04:01


3 Answers

Don't use pointers; instead, pass strings directly. Then you can take advantage of references:

void do_something(std::string const & key)
{
    auto it = m.find(key);
    // ....
}

C++ typically becomes "more correct" the more you use its idioms and don't try to write C with it.
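
A minimal sketch of this advice, assuming the caller can hold its key as a std::string. The string is built once and then passed by const reference, so repeated lookups create no further temporaries. The names `lookup_or`, `sum_repeated_lookups`, and `demo_reference_lookup` are hypothetical helpers, not part of the answer.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: binds to the caller's existing std::string,
// so find() receives the key without any conversion.
int lookup_or(const std::map<std::string, int>& m,
              const std::string& key, int fallback)
{
    auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}

// Hypothetical caller: converts the C string once, then reuses the result.
int sum_repeated_lookups(const std::map<std::string, int>& m,
                         const char* raw, int times)
{
    const std::string key(raw);   // the only const char* -> std::string conversion
    int total = 0;
    for (int i = 0; i < times; ++i)
        total += lookup_or(m, key, 0);
    return total;
}

// Usage example.
void demo_reference_lookup()
{
    std::map<std::string, int> m;
    m["foo"] = 2;
    assert(sum_repeated_lookups(m, "foo", 3) == 6);
}
```

This does not remove the conversion when the key only exists as a const char *; it just ensures the conversion happens once rather than on every call.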

Kerrek SB answered Oct 21 '22 19:10


You can avoid the temporary by giving the std::map a custom comparator class that compares const char * keys by content. (The default comparator orders the pointers by address, which isn't what you want. You need to compare the strings' values.)

Thus, something like:

#include <cstring>

class StrCmp
{
public:
  // const-qualified so the map can call it through a const comparator
  bool operator () (const char *a, const char *b) const
  {
    return std::strcmp(a, b) < 0;
  }
};

// Later:
std::map<const char *, int, StrCmp> m;

Then, use it like a normal map, but pass const char * keys. Keep in mind that every key you store must remain alive for as long as it is in the map. That means you need string literals, or you have to keep the data pointed to by the pointer alive on your own. For these reasons, I'd go with a std::map<std::string, int> and eat the temporary until profiling showed that the above was really needed.
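
As a rough illustration of the lifetime caveat and of lookup by content, here is a small sketch. The names `CStrLess`, `CStrMap`, `find_value`, and `demo_cstr_map` are made up for this example; the keys are string literals, which live for the whole program.

```cpp
#include <cassert>
#include <cstring>
#include <map>

// Orders C strings by their characters rather than by pointer value.
struct CStrLess
{
    bool operator()(const char* a, const char* b) const
    {
        return std::strcmp(a, b) < 0;
    }
};

typedef std::map<const char*, int, CStrLess> CStrMap;

// Hypothetical helper: looks up a C string without building a std::string.
int find_value(const CStrMap& m, const char* key, int fallback)
{
    CStrMap::const_iterator it = m.find(key);
    return it != m.end() ? it->second : fallback;
}

// Usage example.
void demo_cstr_map()
{
    CStrMap m;
    m["foo"] = 1;   // literal keys: their storage outlives the map
    m["bar"] = 2;

    char query[] = "foo";   // same characters, different address than the key
    assert(find_value(m, query, -1) == 1);
    assert(find_value(m, "baz", -1) == -1);
}
```

Storing a pointer into a buffer that is later freed or overwritten would leave a dangling or silently changed key, which is the risk the answer warns about.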

Thanatos answered Oct 21 '22 21:10


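With a std::map<std::string, int> and its default comparator, there is no way to avoid a temporary std::string instance that copies character data, because find() only accepts the key type. Note that this cost is very low and does not incur dynamic memory allocation if your standard library implementation uses the short string optimization and the key is short enough to fit.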
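
Since this answer was written, C++14 added heterogeneous lookup to the ordered associative containers, which is a different technique from the one described here. A minimal sketch, assuming a C++14 or later compiler: declaring the map with the transparent comparator std::less<> lets find() take a const char * and compare it with the stored std::string keys directly. The names `TransparentMap`, `find_transparent`, and `demo_transparent_lookup` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// std::less<> (that is, std::less<void>) is a transparent comparator, so
// find() accepts any type that can be compared with std::string using <.
using TransparentMap = std::map<std::string, int, std::less<>>;

// Hypothetical helper: the const char* is compared against each visited key
// as-is; no std::string is constructed from it.
int find_transparent(const TransparentMap& m, const char* key, int fallback)
{
    auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}

// Usage example.
void demo_transparent_lookup()
{
    TransparentMap m;
    m.emplace("foo", 1);
    assert(find_transparent(m, "foo", -1) == 1);
    assert(find_transparent(m, "bar", -1) == -1);
}
```

This covers std::map only. std::unordered_map gained a comparable facility in C++20, and it requires both a transparent hash and a transparent key-equality functor.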

However, if you need to proxy C-style strings on a frequent basis, you can still come up with custom solutions that bypass this allocation. This can pay off if you have to do this really often and your strings are too long to benefit from the short string optimization.

If you only need a very small subset of string functionality (e.g. only assignment and copies), then you can write a small special-purpose string class that stores a const char * pointer and a function to release the memory.

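 #include <cstddef>
 #include <cstring>

 class cheap_string
 {
 public:
     typedef void(*Free)(const char*);
 private:
     const char * myData;
     std::size_t mySize;
     Free myFree;

     // allocates a NUL-terminated copy of the first `size` characters.
     static const char * clone ( const char * data, std::size_t size )
     {
         char * copy = new char[size+1];
         std::memcpy(copy, data, size);
         copy[size] = '\0';
         return copy;
     }
 public:
     // direct member assignments, use with care.
     cheap_string ( const char * data, std::size_t size, Free free )
         : myData(data), mySize(size), myFree(free)
     {
     }

     // releases using custom deleter (a no-op for proxies).
     ~cheap_string () { myFree(myData); }

     // create real copies (safety first).
     cheap_string ( const cheap_string& other )
         : myData(clone(other.myData, other.mySize)), mySize(other.mySize), myFree(&destroy)
     {
     }
     cheap_string& operator= ( const cheap_string& other )
     {
         if (this != &other)
         {
             // copy first, so a failed allocation leaves *this untouched.
             const char * copy = clone(other.myData, other.mySize);
             myFree(myData);
             myData = copy;
             mySize = other.mySize;
             myFree = &destroy;
         }
         return *this;
     }
     cheap_string ( const char * data )
         : myData(clone(data, std::strlen(data))), mySize(std::strlen(data)), myFree(&destroy)
     {
     }
     cheap_string ( const char * data, std::size_t size )
         : myData(clone(data, size)), mySize(size), myFree(&destroy)
     {
     }

     const char * data () const { return myData; }
     std::size_t size () const { return mySize; }

     // whatever string functionality you need.
     bool operator< ( const cheap_string& rhs ) const
     {
         // lexicographic comparison that also works for proxies without a terminator.
         const std::size_t n = mySize < rhs.mySize ? mySize : rhs.mySize;
         const int c = std::memcmp(myData, rhs.myData, n);
         return c < 0 || (c == 0 && mySize < rhs.mySize);
     }
     bool operator== ( const cheap_string& rhs ) const
     {
         return mySize == rhs.mySize && std::memcmp(myData, rhs.myData, mySize) == 0;
     }

     // create proxies for existing character buffers.
     static cheap_string proxy ( const char * data )
     {
          return cheap_string(data, std::strlen(data), &abandon);
     }

     static cheap_string proxy ( const char * data, std::size_t size )
     {
          return cheap_string(data, size, &abandon);
     }

 private:
     // deleter for proxies (no-op)
     static void abandon ( const char * )
     {
         // no-op, this is used for proxies, which don't own the data!
     }

     // deleter for copies (delete[]).
     static void destroy ( const char * data )
     {
         delete [] data;
     }
 };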

Then, you can use this class as:

 std::map<cheap_string, int> m1;
 auto i = m1.find(cheap_string::proxy("foo"));

The temporary cheap_string instance does not create a copy of the character buffer like std::string does, yet it preserves safe copy semantics for storing instances of cheap_string in standard containers.

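Note: before C++17, eliding the copy of the value returned by proxy() is permitted but not guaranteed. If your implementation does not perform return value optimization, the copy constructor makes a real copy and the benefit is lost. In that case you'll want to find an alternate syntax for the proxy method, such as a constructor with a special overload (taking a custom proxy_t tag type, à la std::nothrow_t for the nothrow form of operator new).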
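
The tag-constructor alternative mentioned in the note could look roughly like the sketch below. It uses a trimmed-down stand-in class, `cheap_key`, rather than the full cheap_string, and tracks ownership with a bool instead of a function pointer; `proxy_t`, `proxy`, `find_via_proxy`, and `demo_proxy_tag` are likewise hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>

// Hypothetical tag type, in the spirit of std::nothrow_t / std::nothrow.
struct proxy_t {};
const proxy_t proxy = proxy_t();

// Trimmed-down stand-in with just enough functionality to act as a map key.
class cheap_key
{
    const char* myData;
    std::size_t mySize;
    bool myOwned;   // true when myData was allocated by this object

    static const char* clone(const char* data, std::size_t size)
    {
        char* copy = new char[size + 1];
        std::memcpy(copy, data, size);
        copy[size] = '\0';
        return copy;
    }

public:
    // Owning constructor: copies the characters.
    explicit cheap_key(const char* data)
        : myData(clone(data, std::strlen(data))),
          mySize(std::strlen(data)), myOwned(true) {}

    // Proxy constructor: selected by the tag, borrows the caller's buffer.
    cheap_key(proxy_t, const char* data)
        : myData(data), mySize(std::strlen(data)), myOwned(false) {}

    // Copies always own their data, so keys stored in a container never dangle.
    cheap_key(const cheap_key& other)
        : myData(clone(other.myData, other.mySize)),
          mySize(other.mySize), myOwned(true) {}

    cheap_key& operator=(const cheap_key&) = delete;  // left out to keep the sketch short

    ~cheap_key() { if (myOwned) delete[] myData; }

    bool owns() const { return myOwned; }

    bool operator<(const cheap_key& rhs) const
    {
        return std::strcmp(myData, rhs.myData) < 0;
    }
};

// The temporary is built directly by the tag constructor, so no return
// value optimization is needed to avoid copying the characters.
int find_via_proxy(const std::map<cheap_key, int>& m, const char* key, int fallback)
{
    std::map<cheap_key, int>::const_iterator it = m.find(cheap_key(proxy, key));
    return it != m.end() ? it->second : fallback;
}

// Usage example.
void demo_proxy_tag()
{
    std::map<cheap_key, int> m;
    m.emplace(cheap_key("foo"), 1);   // the stored key is an owning copy
    assert(find_via_proxy(m, "foo", -1) == 1);
    assert(find_via_proxy(m, "bar", -1) == -1);
}
```

Because the proxy is constructed in place at the call site, the approach does not rely on the compiler eliding a copy from a factory function.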

André Caron answered Oct 21 '22 20:10