Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Sparsehash uses realloc() on type which is not trivially copyable

Consider this simple program:

#include <string>
#include <sparsehash/dense_hash_map>

int main()
{
    google::dense_hash_map<std::string, int> map;
    map["foo"] = 0;
}

Compiling with GCC 8.2 and -Wclass-memaccess (or -Wall) produces a warning:

sparsehash/internal/libc_allocator_with_realloc.h:68:40: warning:
‘void* realloc(void*, size_t)’ moving an object of non-trivially copyable type
    ‘struct std::pair<const std::__cxx11::basic_string<char>, int>’;
use ‘new’ and ‘delete’ instead [-Wclass-memaccess]
    return static_cast<pointer>(realloc(p, n * sizeof(value_type)));
                                ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~

The questions are:

  1. Is it undefined behavior?
  2. Can you suggest a fix or workaround which can be applied to the application code (not by changing Sparsehash or avoiding its use)?
  3. (Bonus points) can you construct a program which actually misbehaves due to this (using std::string or your own nontrivial type)? So far I haven't seen any problems in code using std::string as the key type, despite the fact that std::string must be quite a commonly used key type.

I've filed an issue here: https://github.com/sparsehash/sparsehash/issues/149

like image 354
John Zwinck Avatar asked Sep 21 '18 03:09

John Zwinck


2 Answers

  1. Yes, it is undefined behavior.
    But don't despair yet, iff std::string doesn't store any internal pointers in your implementation, nor registers them anywhere, it will "work" anyway; making a bitwise copy will be equivalent to move-construction at the destination and destructing the source.
    Such is the case for most (not all) string-implementations, SSO or not.

  2. If you might use a type not guaranteed to be trivially destructively-moveable, use a different allocator (last template-argument) to avoid bitwise-moves.

  3. Making a program blow up due to invalid moveing by bitwise copy is trivial.
    Use this type with google::dense_hash_map:

    class bang {
        bang* p;
    public:
        bang() : p(this) {}
        bang(bang const&) : bang() {}
        bang& operator=(bang const&) { return *this; }
        ~bang() { if (p != this) std::abort(); }
    };
    
like image 189
Deduplicator Avatar answered Nov 09 '22 00:11

Deduplicator


1. Is it undefined behavior? Yes. You should never copy objects using realloc(), because sometimes they have internal pointers that point to a resource. The problem comes later when 2 different objects have their destructors run. Now a double deallocation occurs of the same resource, a complete no no.

2. Can you suggest a fix or workaround which can be applied to the application code (not by changing Sparsehash or avoiding its use)?

Try

#include <memory>

and change line

google::dense_hash_map<std::string, int> map;

to

google::dense_hash_map<std::string, int, std::hash<std::string>, std::equal_to<std::string>, std::allocator> map;

Now, it won't use google's allocator libc_allocator_with_realloc

3. (Bonus points) can you construct a program which actually misbehaves due to this (using std::string or your own nontrivial type)? So far I haven't seen any problems in code using std::string as the key type, despite the fact that std::string must be quite a commonly used key type.

Not easily. Because you are trying to cause undefined behavour. In your test program, I would feed strings that are at least 32 characters in length, so the small string optimisation does not kick in. And there are tests that can be done in gcc's heap to see if it has gone corrupt. See 1

like image 34
SJHowe Avatar answered Nov 09 '22 00:11

SJHowe