Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does libc++'s implementation of map use this union?

#if __cplusplus >= 201103L

template <class _Key, class _Tp>
union __value_type
{
    typedef _Key                                     key_type;
    typedef _Tp                                      mapped_type;
    typedef pair<const key_type, mapped_type>        value_type;
    typedef pair<key_type, mapped_type>              __nc_value_type;

    value_type __cc;
    __nc_value_type __nc;

    template <class ..._Args>
    _LIBCPP_INLINE_VISIBILITY
    __value_type(_Args&& ...__args)
        : __cc(std::forward<_Args>(__args)...) {}

    _LIBCPP_INLINE_VISIBILITY
    __value_type(const __value_type& __v)
        : __cc(__v.__cc) {}

    _LIBCPP_INLINE_VISIBILITY
    __value_type(__value_type& __v)
        : __cc(__v.__cc) {}

    _LIBCPP_INLINE_VISIBILITY
    __value_type(__value_type&& __v)
        : __nc(std::move(__v.__nc)) {}

    _LIBCPP_INLINE_VISIBILITY
    __value_type& operator=(const __value_type& __v)
        {__nc = __v.__cc; return *this;}

    _LIBCPP_INLINE_VISIBILITY
    __value_type& operator=(__value_type&& __v)
        {__nc = std::move(__v.__nc); return *this;}

    _LIBCPP_INLINE_VISIBILITY
    ~__value_type() {__cc.~value_type();}
};

#else
// definition for C++03...

It looks like the purpose is to make __value_type assignable and movable while also being able to expose the content as pair<const key_type, mapped_type> (which is the value type of iterators and so on). But I don't see why it needs to be assignable or movable, since I can't see any reason why the implementation would ever need to copy or move nodes inside a map, or indeed to do anything other than construct and destroy them in-place, and reconfigure pointers.

like image 774
Brian Bi Avatar asked Jul 25 '15 05:07

Brian Bi


2 Answers

This is in support of Potatoswatter's answer. I answer as the author of this libc++ code.

Consider:

int
main()
{
    std::map<A, int> m1;
    m1[A{1}] = 1;
    m1[A{2}] = 2;
    m1[A{3}] = 3;
    std::map<A, int> m2;
    m2[A{4}] = 4;
    m2[A{5}] = 5;
    m2[A{6}] = 6;
    std::cout << "start copy assignment\n";
    m2 = m1;
    std::cout << "end copy assignment\n";
}

In this particular circumstance I foresaw the need to both recycle nodes of the map, and reassign the "const" keys to make the recycling of nodes efficient. Therefore

http://cplusplus.github.io/LWG/lwg-defects.html#704

inserted the following wording to allow for recycling of map nodes:

The associative containers meet all of the requirements of Allocator-aware containers (23.2.1 [container.requirements.general]), except for the containers map and multimap, the requirements placed on value_type in Table 93 apply instead directly to key_type and mapped_type. [Note: For example key_type and mapped_type are sometimes required to be CopyAssignable even though the value_type (pair) is not CopyAssignable. — end note]

Thus allowing the containers non-const access to the map's key_type. To date, only libc++ takes advantage of this. If you instrument A in the above example you will get for libc++:

start copy assignment
operator=(const A& a)
operator=(const A& a)
operator=(const A& a)
end copy assignment

For libstdc++ (gcc-5.2.0)

start copy assignment
~A()
A(A const& a)
~A()
A(A const& a)
~A()
A(A const& a)
end copy assignment

For VS-2015:

start copy assignment
~A()
~A()
~A()
A(A const& a)
A(A const& a)
A(A const& a)
end copy assignment

I assert that when A is a type such as int, std::vector or std::string, or a type containing one of these common std types, there is a tremendous performance advantage of assignment over destruction followed by construction. Assignment can take advantage of existing capacity in the lhs, often resulting in a simple memcpy as opposed to a deallocation of memory followed by an allocation of memory.

Note that above ~A() likely implies deallocation of the entire node, not just A (though not necessarily). In any event, the libc++ map copy assignment operator is highly optimized to recycle memory, and permission for that optimization is backed by the C++11 and beyond standards. The union trick is one (not necessarily portable) way of achieving that optimization.

A similar optimization is obtained by this same code for the move assignment operator when the allocator does not propagate on move assignment and the two allocators compare unequal:

clang/libc++:

start move assignment
operator=(A&& a)
operator=(A&& a)
operator=(A&& a)
end move assignment

gcc-5.2.0

start move assignment
~A()
A(A const& a)
~A()
A(A const& a)
~A()
A(A const& a)
~A()
~A()
~A()
end move assignment

VS-2015

start move assignment
~A()
~A()
~A()
A(A const& a)
A(A const& a)
A(A const& a)
end move assignment
like image 126
Howard Hinnant Avatar answered Sep 18 '22 23:09

Howard Hinnant


When you use a custom allocator, it may be necessary to move the map (and its contents) into a new resource pool. In that case, this overload will provide movable access to the keys:

__value_type(__value_type&& __v)
    : __nc(std::move(__v.__nc)) {}

It doesn't matter that keys have been moved-from, since the next thing that happens is freeing all the nodes.

Note, this usage may incur undefined behavior. You can't generally write one member of a union and then read another. Clang and libc++ can do this as long as they can internally guarantee it won't cause a problem (or error diagnosis).

They probably did it this way, though, because there is no good conforming alternative. At least, I can't think of one. The standard requires that value_type::first_type is genuinely const qualified, so even a const_cast is not allowed.

The trick is conforming in the case that key_type and mapped_type are both standard layout, so that std::pair<key_type, mapped_type> and std::pair<key_type const, mapped_type> are layout-compatible, per [class.mem] §9.2/16. It looks a bit odd here because the function is referring to the immediate members __cc and __nc of the union, leaving it to the constructor to access the common subsequence comprising first and second. The requirements for standard-layout types are somewhat restrictive, but many common key and value types (for example, std::string) can potentially meet them.

like image 20
Potatoswatter Avatar answered Sep 19 '22 23:09

Potatoswatter