Unnecessary emptying of moved-from std::string

Both libstdc++ and libc++ make a moved-from std::string object empty, even if the original string is short and the short string optimization applies. It seems to me that this emptying adds unnecessary runtime overhead. For instance, here is the move constructor of std::basic_string from libstdc++:

basic_string(basic_string&& __str) noexcept
  : _M_dataplus(_M_local_data(), std::move(__str._M_get_allocator())) {
    if (__str._M_is_local()) 
      traits_type::copy(_M_local_buf, __str._M_local_buf, _S_local_capacity + 1);
    else {
      _M_data(__str._M_data());
      _M_capacity(__str._M_allocated_capacity);
    }
    _M_length(__str.length());
    __str._M_data(__str._M_local_data());  // (1)
    __str._M_set_length(0);                // (2)
  }

(1) is an assignment that is useless in case of a short string, since data is already set to local data, so we just assign a pointer the same value it has been assigned before.

(2) Emptying the string sets the string size and resets the first character in the local buffer, which, as far as I know, the Standard does not require.

Usually, library implementers try to implement the Standard as efficiently as possible (for instance, deleted memory regions are not zeroed out). My question is whether there is any particular reason why moved-from strings are emptied even though it is not required and adds unnecessary overhead. That overhead can easily be eliminated for short strings, e.g., by:

basic_string(basic_string&& __str) noexcept
  : _M_dataplus(_M_local_data(), std::move(__str._M_get_allocator())) {
    if (__str._M_is_local()) {
      traits_type::copy(_M_local_buf, __str._M_local_buf, _S_local_capacity + 1);
      _M_length(__str.length());
    }
    else {
      _M_data(__str._M_data());
      _M_capacity(__str._M_allocated_capacity);
      _M_length(__str.length());
      __str._M_data(__str._M_local_data());  // (1)
      __str._M_set_length(0);                // (2)
    }
  }
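The behavior in question can be observed from user code. The following is only a minimal sketch: `MoveResult` and `move_and_inspect` are hypothetical names introduced for illustration, and since the Standard leaves a moved-from string in a valid but unspecified state, the `source_empty` flag reports what a given implementation happens to do, not a guarantee.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct MoveResult {
    std::string destination;  // the string constructed by the move
    bool source_empty;        // whether the moved-from string was empty afterwards
};

// Hypothetical helper: move-construct from `source` and report the aftermath.
inline MoveResult move_and_inspect(std::string source) {
    std::string destination(std::move(source));
    // empty() has no preconditions, so calling it on a moved-from string is valid.
    bool source_empty = source.empty();
    // The moved-from string must remain usable: assigning to it is always allowed.
    source = "reused";
    assert(source == "reused");
    return {std::move(destination), source_empty};
}
```

Calling `move_and_inspect("short")` and `move_and_inspect(std::string(100, 'x'))` exercises the short and the long case respectively; only the contents of `destination` are guaranteed by the Standard.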
asked Oct 08 '18 06:10 by Daniel Langr



1 Answer

In the case of libc++, the string move constructor does empty the source, but it is not unnecessary. Indeed, the author of this string implementation was the same person that led the move semantics proposal for C++11. ;-)

This implementation of the libc++ string was actually designed from the move members outwards!

Here is the code with some unnecessary details (like debug mode) left out:

template <class _CharT, class _Traits, class _Allocator>
basic_string<_CharT, _Traits, _Allocator>::basic_string(basic_string&& __str)
        _NOEXCEPT
    : __r_(_VSTD::move(__str.__r_))
{
    __str.__zero();
}

In a nutshell, this code copies all of the bytes of the source, and then zeros all of the bytes of the source. One thing to immediately note: There is no branching: this code does the same thing for long and short strings.

Long string mode

In "long mode", the layout is 3 words: a data pointer and two integral types that store size and capacity, minus 1 bit for the long/short flag, plus space for an allocator (optimized away for empty allocators).

So this copies the pointer/sizes, and then nulls out the source to release ownership of the pointer. This also sets the source to "short mode" as the short/long bit means short in the zero state. Also all zero bits in the short mode represent a zero-size, non-zero capacity short string.

Short string mode

When the source is a short string, the code is identical: The bytes are copied over, and the source bytes are zeroed out. In short mode there are no self-referencing pointers, and so copying bytes is the correct algorithm.

Now it is true that in "short mode" the zeroing of the 3 words of the source might seem unnecessary, but to avoid it one would have to check the long/short bit and zero the bytes only when in long mode. Doing this check-and-branch would actually be more expensive than just zeroing the 3 words because of the occasional branch misprediction (breaking the pipeline).
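The copy-then-zero idea can be sketched with a toy representation. This is an illustration under stated assumptions, not libc++'s actual layout: `ToyRep`, `toy_move`, `toy_make_short`, and the choice of the lowest bit of the first byte as the long flag are all hypothetical. What it shows is that an all-zero object decodes as an empty short string, so one unconditional memcpy plus memset handles both modes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy stand-in for a 3-word string representation (NOT libc++'s real layout).
struct ToyRep {
    unsigned char bytes[3 * sizeof(void*)];
};

// Assumed toy encoding: the lowest bit of byte 0 is the long flag,
// the remaining 7 bits of byte 0 hold the short size.
inline bool toy_is_long(const ToyRep& r) { return (r.bytes[0] & 1u) != 0; }
inline std::size_t toy_short_size(const ToyRep& r) { return r.bytes[0] >> 1; }

// Branch-free move: copy every byte, then zero every byte of the source.
// The zeroed source has the long flag clear and size 0: an empty short string.
inline void toy_move(ToyRep& dst, ToyRep& src) {
    std::memcpy(&dst, &src, sizeof(ToyRep));
    std::memset(&src, 0, sizeof(ToyRep));
}

// Build a short-mode rep holding `text` (byte 0 is the header, and one byte
// is reserved for the terminating zero).
inline ToyRep toy_make_short(const char* text) {
    ToyRep r{};  // zero-initialized, so the terminator is already in place
    std::size_t n = std::strlen(text);
    assert(n < sizeof(r.bytes) - 1);
    r.bytes[0] = static_cast<unsigned char>(n << 1);
    std::memcpy(r.bytes + 1, text, n);
    return r;
}
```

In this toy, a long-mode object would keep its pointer, size, and capacity in the same bytes, and `toy_move` would transfer and release them with exactly the same two calls.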

Here is the optimized x86 (64bit) assembly for the libc++ string move constructor.

#include <string>
#include <utility>

std::string
test(std::string& s)
{
    return std::move(s);
}

__Z4testRNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE: ## @_Z4testRNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    movq    16(%rsi), %rax
    movq    %rax, 16(%rdi)
    movq    (%rsi), %rax
    movq    8(%rsi), %rcx
    movq    %rcx, 8(%rdi)
    movq    %rax, (%rdi)
    movq    $0, 16(%rsi)
    movq    $0, 8(%rsi)
    movq    $0, (%rsi)
    movq    %rdi, %rax
    popq    %rbp
    retq
    .cfi_endproc

(no branches!)

<aside>

The size of the internal buffer for the short string is also optimized for the move members. The internal buffer is "union'ed" with the 3 words required for "long mode", so that the sizeof(string) requires no more space than when in long mode. Despite this compact sizeof (the smallest among the 3 major implementations), libc++ enjoys the largest internal buffer on 64 bit architectures: 22 char.

The small sizeof translates into faster move members since all these members do is copy and zero bytes of the object layout.
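The two quantities discussed here can be checked on any toolchain. A minimal sketch, where `StringFootprint` and `string_footprint` are hypothetical names introduced for illustration; it assumes that the capacity of a default-constructed string reflects the size of the internal buffer, which holds for short-string-optimized implementations but is not something the Standard promises.

```cpp
#include <cstddef>
#include <string>

struct StringFootprint {
    std::size_t object_size;     // sizeof(std::string) in bytes
    std::size_t empty_capacity;  // capacity() of a default-constructed string
};

// Hypothetical helper: report the object size and the capacity available
// before any allocation could have happened.
inline StringFootprint string_footprint() {
    return {sizeof(std::string), std::string().capacity()};
}
```

Printing both fields under different standard libraries gives a direct way to compare the object size against the internal buffer size described above.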

See this Stack Overflow answer for more details on the internal buffer size.

</aside>

Summary

So in summary, the setting of the source to an empty string is necessary in "long mode" to transfer ownership of the pointer, and also necessary in short mode for performance reasons to avoid a broken pipeline.

I have no comment on the libstdc++ implementation as I did not author that code and your question already does a good job of that anyway. :-)

answered Oct 16 '22 02:10 by Howard Hinnant