While writing some code targeting C++17, I kind of hit a stumbling block determining the exception-safety of an operation merging two compatible std::unordered_maps. Per the current working draft, §26.2.7, table 91 reads, in part, regarding the conditions of <code>a.merge( a2 )</code>: <blockquote> Requires: <code>a.get_allocator() == a2.get_allocator()</code>. Attempts to extract each element in <code>a2</code> and insert it into <code>a</code> using the hash function and key equality predicate of <code>a</code>. In containers with unique keys, if there is an element in <code>a</code> with key equivalent to the key of an element from <code>a2</code>, then that element is not extracted from <code>a2</code>. Postconditions: Pointers and references to the transferred elements of <code>a2</code> refer to those same elements but as members of <code>a</code>. Iterators referring to the transferred elements and all iterators referring to <code>a</code> will be invalidated, but iterators to elements remaining in <code>a2</code> will remain valid. Throws: Nothing unless the hash function or key equality predicate throws. </blockquote> It's worth noting that these conditions are strongly reminiscent of those given for the requirements of ordinary associative containers (std::map), given in §26.2.6, table 90, <code>a.merge( a2 )</code>: <blockquote> Requires: <code>a.get_allocator() == a2.get_allocator()</code>. Attempts to extract each element in <code>a2</code> and insert it into <code>a</code> using the comparison object of <code>a</code>. In containers with unique keys, if there is an element in <code>a</code> with key equivalent to the key of an element from <code>a2</code>, then that element is not extracted from <code>a2</code>. Postconditions: Pointers and references to the transferred elements of <code>a2</code> refer to those same elements but as members of <code>a</code>. Iterators referring to the transferred elements will continue to refer to their elements, but they now behave as iterators into <code>a</code>, not into <code>a2</code>. Throws: Nothing unless the comparison object throws. </blockquote> I needed to merge two std::unordered_maps with the same number of elements which I could ensure would be unique across both containers, meaning that the map containing the merged result would have double the number of elements it previously had, and the container merged-from would be empty. This should be perfectly safe thanks to C++17, right? This is a huge win in terms of performance… except, I had this nagging doubt. Hilariously, the postcondition statement says nothing about whether or not the previous max load factor in the merged map would be honored, and while that seems like a safe implicit assumption, it seemed to naïvely conflict with the statement about the unordered_map's exception-safety. If you're using a hash table design wherein the buckets are contiguously allocated buffers, maintaining your load factor would seem to imply rehashing, which would seem to imply reallocation of the bucket buffer. Already, this seemed like an exercise in extreme navel-gazing and there was good cause to leave well enough alone: a more complicated hash table could conceivably be made as a completely node-based structure, similar to the red-black trees usually underpinning std::map, and in such a case, the spec seemed reasonable as a rehash wouldn't imply allocation. Perhaps to my benefit, I succumbed to doubt and dug into gcc-7.1's implementation of merge. It's incredibly complicated, but to summarize my findings, I found that the buckets are, indeed, contiguously allocated buffers, and rehashing does imply reallocation. Maybe, I thought, there was some deeper magic I was missing (I stared at the source for nearly an entire day and still felt I had a poor understanding of it) which just disabled rehashing during a merge, meaning that all of the specified conditions would be upheld, but you could get a nasty performance regression after a suitably large merge, since your map would likely be overloaded. I moved for a practical assessment mirroring my use-case (which I would have presented if it were possible, I'm sorry), instead of just questioning my interpretation of libstdc++: <pre class="prettyprint"><code>#include <memory> // for std::shared_ptr<> #include <new> // for std::bad_alloc #include <utility> // for std::move(), std::pair<> #include <type_traits> // for std::true_type #include <unordered_map> // for std::unordered_map<> #include <functional> // for std::hash<>, std::equal_to<> #include <string> // for std::string #include <iostream> // for std::cout #include <cstddef> // for std::size_t template<typename T> class PrimedFailureAlloc { public: using value_type = T; using propagate_on_container_copy_assignment = std::true_type; using propagate_on_container_move_assignment = std::true_type; using propagate_on_container_swap = std::true_type; PrimedFailureAlloc() = default; template<typename U> PrimedFailureAlloc( const PrimedFailureAlloc& source ) noexcept : m_triggered{ source.m_triggered } { } template<typename U> PrimedFailureAlloc( PrimedFailureAlloc&& source ) noexcept : m_triggered{ std::move( source.m_triggered ) } { } T* allocate( std::size_t n ) { if ( *m_triggered ) throw std::bad_alloc{}; return static_cast<T*>( ::operator new( sizeof( T ) * n ) ); } void deallocate( T* ptr, std::size_t n ) noexcept { ::operator delete( ptr ); } bool operator==( const PrimedFailureAlloc& rhs ) noexcept { return m_triggered == rhs.m_triggered; } void trigger() noexcept { *m_triggered = true; } private: template<typename U> friend class PrimedFailureAlloc; std::shared_ptr<bool> m_triggered{ new bool{ false } }; }; template<typename T> bool operator!=( const PrimedFailureAlloc<T>& lhs, const PrimedFailureAlloc<T>& rhs ) noexcept { return !(lhs == rhs); } template< typename Key , typename T , typename Hash = std::hash<Key> , typename KeyEqual = std::equal_to<Key> > using FailingMap = std::unordered_map< Key, T, Hash, KeyEqual, PrimedFailureAlloc<std::pair<const Key, T>> >; template<typename Key, typename T> void printMap( const FailingMap<Key, T>& map ) { std::cout << "{\n"; for ( const auto& [str, index] : map ) std::cout << " { " << str << ", " << index << " }\n"; std::cout << "}\n"; } int main() { PrimedFailureAlloc<std::pair<const std::string, int>> a; FailingMap<std::string, int> m1{ a }; FailingMap<std::string, int> m2{ a }; m1.insert( { { "Purple", 0 }, { "Green", 3 }, { "Indigo", 16 } } ); m2.insert( { { "Blue", 12 }, { "Red", 2 }, { "Violet", 5 } } ); // m1.reserve( m1.size() + m2.size() ); a.trigger(); m1.merge( m2 ); std::cout << "map :=\n"; printMap( m1 ); return 0; } </code></pre> Sure enough, after compiling this code under GCC-7.1, I get: <pre class="prettyprint"><code>terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc [1] 10944 abort ./a.out </code></pre> Whereas, uncommenting line 95 (<code>m1.reserve( m1.size() + m2.size() );</code>), results in the expected output: <pre class="prettyprint"><code>map := { { Red, 2 } { Violet, 5 } { Purple, 0 } { Green, 3 } { Blue, 12 } { Indigo, 16 } } </code></pre> Understanding that C++17 is still a draft standard which hasn't yet been finalized, and that gcc's implementation is experimental, I suppose my question would be: <ol> <li>Have I gravely misconstrued the safety of the merge operation as set out in the standard? Are there best-practices to using <code>std::unordered_map::merge()</code> that I've missed? Was I supposed to be implicitly responsible for ensuring the allocation of buckets? Will using <code>std::unordered_map::reserve()</code> actually be portable when clang, MSVC, and Intel finally support C++17? I mean, my program aborting when an exception-free merge was impossible does, after some notion, adhere to the listed postconditions…</li> <li>Is this actually a defect in the standard? The similarity of the wording between unordered associative containers and ordinary associative containers might have allowed unintended guarantees to slip in if the text were, say, copy-pasted.</li> <li>Is this just a gcc bug?</li> </ol> It's worth noting that I did check gcc's bug tracker prior to this write-up and seemed to find no open bugs matching my description, and furthermore, I checked the C++ standard defect report and similarly seemed to have come up empty (admittedly, doing a text-search of that site is aggravating and I may have been less than thorough). The latter is unsurprising, since standard defects and their workarounds or consequences are usually noted in gcc's source code and I found no such notations during my examination. I tried compiling my example code in my most recent checkout of clang (over a week old), but the compiler segfaulted so I took my examination there no further, and have not consulted libc++.

This is just a defect in the standard, i.e., your possibility 2. LWG has just moved LWG issue 2977 to "Ready" status, which will strike the erroneous Throws clause.

Safety of std::unordered_map::merge()

Tags:

While writing some code targeting C++17, I kind of hit a stumbling block determining the exception-safety of an operation merging two compatible std::unordered_maps. Per the current working draft, §26.2.7, table 91 reads, in part, regarding the conditions of a.merge( a2 ):

Requires: a.get_allocator() == a2.get_allocator().

Attempts to extract each element in a2 and insert it into a using the hash function and key equality predicate of a. In containers with unique keys, if there is an element in a with key equivalent to the key of an element from a2, then that element is not extracted from a2.

Postconditions: Pointers and references to the transferred elements of a2 refer to those same elements but as members of a. Iterators referring to the transferred elements and all iterators referring to a will be invalidated, but iterators to elements remaining in a2 will remain valid.

Throws: Nothing unless the hash function or key equality predicate throws.

It's worth noting that these conditions are strongly reminiscent of those given for the requirements of ordinary associative containers (std::map), given in §26.2.6, table 90, a.merge( a2 ):

Requires: a.get_allocator() == a2.get_allocator().

Attempts to extract each element in a2 and insert it into a using the comparison object of a. In containers with unique keys, if there is an element in a with key equivalent to the key of an element from a2, then that element is not extracted from a2.

Postconditions: Pointers and references to the transferred elements of a2 refer to those same elements but as members of a. Iterators referring to the transferred elements will continue to refer to their elements, but they now behave as iterators into a, not into a2.

Throws: Nothing unless the comparison object throws.

I needed to merge two std::unordered_maps with the same number of elements which I could ensure would be unique across both containers, meaning that the map containing the merged result would have double the number of elements it previously had, and the container merged-from would be empty. This should be perfectly safe thanks to C++17, right?

This is a huge win in terms of performance… except, I had this nagging doubt. Hilariously, the postcondition statement says nothing about whether or not the previous max load factor in the merged map would be honored, and while that seems like a safe implicit assumption, it seemed to naïvely conflict with the statement about the unordered_map's exception-safety. If you're using a hash table design wherein the buckets are contiguously allocated buffers, maintaining your load factor would seem to imply rehashing, which would seem to imply reallocation of the bucket buffer.

Already, this seemed like an exercise in extreme navel-gazing and there was good cause to leave well enough alone: a more complicated hash table could conceivably be made as a completely node-based structure, similar to the red-black trees usually underpinning std::map, and in such a case, the spec seemed reasonable as a rehash wouldn't imply allocation.

Perhaps to my benefit, I succumbed to doubt and dug into gcc-7.1's implementation of merge. It's incredibly complicated, but to summarize my findings, I found that the buckets are, indeed, contiguously allocated buffers, and rehashing does imply reallocation. Maybe, I thought, there was some deeper magic I was missing (I stared at the source for nearly an entire day and still felt I had a poor understanding of it) which just disabled rehashing during a merge, meaning that all of the specified conditions would be upheld, but you could get a nasty performance regression after a suitably large merge, since your map would likely be overloaded.

I moved for a practical assessment mirroring my use-case (which I would have presented if it were possible, I'm sorry), instead of just questioning my interpretation of libstdc++:

#include <memory>        // for std::shared_ptr<> #include <new>           // for std::bad_alloc #include <utility>       // for std::move(), std::pair<> #include <type_traits>   // for std::true_type #include <unordered_map> // for std::unordered_map<> #include <functional>    // for std::hash<>, std::equal_to<> #include <string>        // for std::string #include <iostream>      // for std::cout #include <cstddef>       // for std::size_t  template<typename T> class PrimedFailureAlloc { public:   using value_type = T;   using propagate_on_container_copy_assignment = std::true_type;   using propagate_on_container_move_assignment = std::true_type;   using propagate_on_container_swap = std::true_type;    PrimedFailureAlloc() = default;    template<typename U>   PrimedFailureAlloc( const PrimedFailureAlloc<U>& source ) noexcept     : m_triggered{ source.m_triggered }   { }    template<typename U>   PrimedFailureAlloc( PrimedFailureAlloc<U>&& source ) noexcept     : m_triggered{ std::move( source.m_triggered ) }   { }    T* allocate( std::size_t n )   {     if ( *m_triggered ) throw std::bad_alloc{};     return static_cast<T*>( ::operator new( sizeof( T ) * n ) );   }    void deallocate( T* ptr, std::size_t n ) noexcept   {     ::operator delete( ptr );   }    bool operator==( const PrimedFailureAlloc& rhs ) noexcept   {     return m_triggered == rhs.m_triggered;   }    void trigger() noexcept { *m_triggered = true; }  private:   template<typename U>   friend class PrimedFailureAlloc;    std::shared_ptr<bool> m_triggered{ new bool{ false } }; };  template<typename T> bool operator!=( const PrimedFailureAlloc<T>& lhs,                  const PrimedFailureAlloc<T>& rhs ) noexcept {   return !(lhs == rhs); }  template< typename Key         , typename T         , typename Hash = std::hash<Key>         , typename KeyEqual = std::equal_to<Key>         > using FailingMap = std::unordered_map<   Key,   T,   Hash,   KeyEqual,   PrimedFailureAlloc<std::pair<const Key, T>> >;  template<typename Key, typename T> void printMap( const FailingMap<Key, T>& map ) {   std::cout << "{\n";   for ( const auto& [str, index] : map )     std::cout << "  { " << str << ", " << index << " }\n";   std::cout << "}\n"; }  int main() {   PrimedFailureAlloc<std::pair<const std::string, int>> a;   FailingMap<std::string, int> m1{ a };   FailingMap<std::string, int> m2{ a };    m1.insert( { { "Purple", 0 }, { "Green", 3 }, { "Indigo", 16 } } );   m2.insert( { { "Blue", 12 }, { "Red", 2 }, { "Violet", 5 } } );    // m1.reserve( m1.size() + m2.size() );   a.trigger();   m1.merge( m2 );    std::cout << "map :=\n";   printMap( m1 );    return 0; }

Sure enough, after compiling this code under GCC-7.1, I get:

terminate called after throwing an instance of 'std::bad_alloc'   what():  std::bad_alloc [1]    10944 abort      ./a.out

Whereas, uncommenting line 95 (m1.reserve( m1.size() + m2.size() );), results in the expected output:

map := {   { Red, 2 }   { Violet, 5 }   { Purple, 0 }   { Green, 3 }   { Blue, 12 }   { Indigo, 16 } }

Understanding that C++17 is still a draft standard which hasn't yet been finalized, and that gcc's implementation is experimental, I suppose my question would be:

Have I gravely misconstrued the safety of the merge operation as set out in the standard? Are there best-practices to using std::unordered_map::merge() that I've missed? Was I supposed to be implicitly responsible for ensuring the allocation of buckets? Will using std::unordered_map::reserve() actually be portable when clang, MSVC, and Intel finally support C++17? I mean, my program aborting when an exception-free merge was impossible does, after some notion, adhere to the listed postconditions…
Is this actually a defect in the standard? The similarity of the wording between unordered associative containers and ordinary associative containers might have allowed unintended guarantees to slip in if the text were, say, copy-pasted.
Is this just a gcc bug?

It's worth noting that I did check gcc's bug tracker prior to this write-up and seemed to find no open bugs matching my description, and furthermore, I checked the C++ standard defect report and similarly seemed to have come up empty (admittedly, doing a text-search of that site is aggravating and I may have been less than thorough). The latter is unsurprising, since standard defects and their workarounds or consequences are usually noted in gcc's source code and I found no such notations during my examination. I tried compiling my example code in my most recent checkout of clang (over a week old), but the compiler segfaulted so I took my examination there no further, and have not consulted libc++.

451

asked Jun 13 '17 19:06

Capital C

1 Answers

This is just a defect in the standard, i.e., your possibility 2.

LWG has just moved LWG issue 2977 to "Ready" status, which will strike the erroneous Throws clause.

118

answered Oct 09 '22 00:10

T.C.

Related questions
                            
                                Why use ++i instead of i++ in cases where the value is not used anywhere else in the statement?
                            
                                Semicolon in C++?
                            
                                Most efficient way to compare a variable to multiple values?
                            
                                Why can't libcudart.so.4 be found when compiling the CUDA samples under Ubuntu?
                            
                                Is there anything like "std::and" or "std::or"?
                            
                                Why do c++ programmers use != instead of <
                            
                                how to query if(T==int) with template class
                            
                                How to rewrite complicated lines of C++ code (nested ternary operator)
                            
                                Interpretation of int (*a)[3]
                            
                                Why does passing the literal 3 choose the int overload instead of the short overload?
                            
                                Visual Studio 2017 can't find windows.h
                            
                                Sort based on multiple things in C++
                            
                                Malloc vs New for Primitives
                            
                                What's your favorite g++ option? [closed]
                            
                                Can I use `abstract` keyword in C++ class
                            
                                Why does this output of the same expression from printf differ from cout?
                            
                                How do you make linux GUI's?
                            
                                What advantages can I get from learning C++ if I'm mainly a C# Programmer? [closed]
                            
                                Are do-while-false loops common?
                            
                                What is a good example of recursion other than generating a Fibonacci sequence?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Safety of std::unordered_map::merge()

Tags:

c++

c++17

exception-safety

gcc7

Capital C

People also ask

1 Answers

T.C.

Recent Activity

Donate For Us