Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I avoid wasteful copying of keys in a B-tree based STL-like map?

I'm replacing a use of std::map in a hot path with cpp-btree's btree_map. But with optimization enabled, GCC and Clang complain about a strict aliasing violation. The problem boils down to this:

template <typename Key, typename Value>
class btree_map {
public:
    // In order to match the standard library's container interfaces
    using value_type = std::pair<const Key, Value>;

private:
    using mutable_value_type = std::pair<Key, Value>;

    struct node_type {
        mutable_value_type values[N];
        // ...
    };

public:
    class iterator {
        // ...

        value_type& operator*() {
            // Here we cast from const std::pair<Key, Value>&
            // to const std::pair<const Key, Value>&
            return reinterpret_cast<value_type&>(node->values[i]);
        }
    };

    std::pair<iterator, bool> insert(const value_type& value) {
        // ...
        // At this point, we have to insert into the middle of a node.
        // Here we rely on nodes containing mutable_value_type, because
        // value_type isn't assignable due to the const Key member
        std::move_backward(node->values + i, node->values + j,
                           node->values + j + 1);
        node->values[i] = value;
        // ...
    }
};

This got me thinking, is there a way to do this as efficiently that doesn't rely on undefined behaviour? The keys I'm using are efficiently moveable but fairly slow to copy, so I'd love to avoid copying many keys on every insertion. I've considered

  • Using value_type values[N], then const_cast<Key&>(values[i].first) = std::move(key) to move the key around, but I'm pretty sure that's still undefined
  • Returning std::pair<const Key&, Value&> instead of std::pair<const Key, Value>& when appropriate, but I'm not sure this would still satisfy the container requirements (I hear ...::reference is supposed to really be a reference type)
  • Not caring. The code works as-is, but I'm curious if it can be done in a standard-compliant way. There's also the chance that future compilers do different things with the UB, and I don't know of a way to apply -fno-strict-aliasing to only a single class.

Any other ideas?

like image 434
Tavian Barnes Avatar asked Jan 21 '15 17:01

Tavian Barnes


2 Answers

Quoting from strict aliasing rules,

If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:

  • ...

  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), ...

Hence going from std::pair<Key,Value> to std::pair<const Key, Value> via intermediate cast to a union or a struct containing both types as members won't break strict aliasing rules.

Caveat: std::pair in a union is not permited until C++11, can use a structure instead.

Caveat: assumption that the two pair types have compatible layouts may not hold true. Imagine an implementation that orders first and second differently depending on the constness of the Key type.

like image 147
Nick Zavaritsky Avatar answered Nov 01 '22 18:11

Nick Zavaritsky


Modified: expanded move_backward(...) into for-loop with explicit destructor call and placement new to avoid assignment error.

Placement new can be used instead of simple assignment.

Note: this implementation below is not exception-safe. Additional code is required for exception-safety.

template <typename Key, typename Value>
class btree_map {
// ...
private:
    struct node_type {
        // Declare as value_type, not mutable_value_type.
        value_type values[N];
        // ...
    };

    class iterator {
        // ...

        value_type& operator*() {
            // Simply return values[i].
            return node->values[i];
        }
    };

    std::pair<iterator, bool> insert(const value_type& value) {
        // ...
        // expand move_backward(...)
        for(size_t k = j + 1; k > i; k--) {
            // Manually delete the previous value prior to assignment.
            node->values[k].~value_type();
            // Assign the new value by placement new.
            // Note: it goes wrong if an exception occurred at this point.
            new(&node->values[k]) value_type(node->values[k - 1]);
        }
        // Matual delete and placement new too.
        node->values[i].~value_type();
        // Note: it goes wrong if an exception occurred at this point.
        new (&node->values[i]) value_type(value);
        // ...
    }
};
like image 41
yoh2 Avatar answered Nov 01 '22 19:11

yoh2