
Is `string.assign(string.data(), 5)` well-defined or UB?

A coworker wanted to write this:

std::string_view strip_whitespace(std::string_view sv);

std::string line = "hello  ";
line = strip_whitespace(line);

I said that returning string_view made me uneasy a priori, and furthermore, the aliasing here looked like UB to me.

I can say with certainty that line = strip_whitespace(line) in this case is equivalent to line = std::string_view(line.data(), 5). I believe that calls string::operator=(const T&) [with T=string_view], which is defined to be equivalent to line.assign(const T&) [with T=string_view]. That in turn is defined to be equivalent to line.assign(line.data(), 5), which is defined to do this:

Preconditions: [s, s + n) is a valid range.
Effects: Replaces the string controlled by *this with a copy of the range [s, s + n).
Returns: *this.

But this doesn't say what happens when there's aliasing.
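To make the aliasing concrete, here is a minimal sketch. The body of strip_whitespace is an assumption (the question only shows its declaration), and aliases() and example() are hypothetical helpers added for illustration. The sketch only checks that the returned view points into line's own buffer; it deliberately does not perform the assignment under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>

// Assumed implementation: trims leading and trailing spaces and tabs.
// The returned view refers to the caller's characters; nothing is copied.
std::string_view strip_whitespace(std::string_view sv) {
    const std::size_t first = sv.find_first_not_of(" \t");
    if (first == std::string_view::npos) return std::string_view{};
    const std::size_t last = sv.find_last_not_of(" \t");
    return sv.substr(first, last - first + 1);
}

// Hypothetical helper: true when the view's characters lie inside s's storage,
// i.e. when `s = sv` would be an aliasing assignment.
bool aliases(const std::string& s, std::string_view sv) {
    std::less_equal<const char*> le;  // well-defined ordering even for unrelated pointers
    return !sv.empty() && le(s.data(), sv.data()) &&
           le(sv.data() + sv.size(), s.data() + s.size());
}

void example() {
    std::string line = "hello  ";
    const std::string_view sv = strip_whitespace(line);
    assert(sv.size() == 5);            // the length used in the question
    assert(sv.data() == line.data());  // the view starts at line's first character
    assert(aliases(line, sv));
    // The statement in question, `line = sv;`, is intentionally not executed here.
}
```

So by the time assign(s, n) runs, s is line.data() itself, which is exactly the case the quoted wording doesn't address.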

I asked this question on the cpplang Slack yesterday and got mixed answers. Looking for super authoritative answers here, and/or empirical analysis of real library vendors' implementations.


I wrote test cases for string::assign, vector::assign, deque::assign, list::assign, and forward_list::assign.

  • libc++ makes all of these test cases work.
  • libstdc++ makes them all work except for forward_list, which segfaults.
  • I don't know about MSVC's library.

The segfault in libstdc++ gives me hope that this is UB; but I also see both libc++ and libstdc++ going to great effort to make this work at least in the common cases.

asked Feb 13 '20 16:02 by Quuxplusone


1 Answer

Barring a couple of exceptions, of which yours is not one, calling a non-const member function (here, assign) on a string invalidates [...] pointers [...] to its elements. That violates assign's precondition that [s, s + n) is a valid range, so this is undefined behavior.
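If you want to avoid the question entirely, one option is to copy the viewed characters into a temporary std::string before assigning, so the source no longer points into the destination. This is a sketch, not the only fix: the body of strip_whitespace is again an assumption, and strip_in_place and example are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Assumed implementation: trims leading and trailing spaces and tabs.
std::string_view strip_whitespace(std::string_view sv) {
    const std::size_t first = sv.find_first_not_of(" \t");
    if (first == std::string_view::npos) return std::string_view{};
    const std::size_t last = sv.find_last_not_of(" \t");
    return sv.substr(first, last - first + 1);
}

// Hypothetical helper: the temporary std::string finishes copying the
// characters before `line` is modified, and the assignment is then an
// ordinary move assignment from an unrelated object.
void strip_in_place(std::string& line) {
    line = std::string(strip_whitespace(line));
}

void example() {
    std::string line = "hello  ";
    strip_in_place(line);
    assert(line == "hello");
}
```

This costs one extra allocation for strings that don't fit in the small-string buffer, but it no longer depends on how a particular library handles an aliased source range.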

Note that string::operator=(string const&) has language specifically to make self-assignment a no-op.

answered Oct 24 '22 12:10 by ecatmur