
Using C functions to manipulate std::string

Sometimes you need to fill a std::string with characters produced by a C function. A typical example is this:

constexpr static std::size_t BUFFERSIZE{256};
char buffer[BUFFERSIZE];
snprintf (buffer, BUFFERSIZE, formatstring, value1, value2);
return std::string(buffer);

Notice how we first need to fill a local buffer, and then copy it to the std::string.

The example becomes more complex if the maximum buffer size is calculated at run time and may be too large to put on the stack. For example:

constexpr static std::size_t BUFFERSIZE{256};
if (calculatedBufferSize>BUFFERSIZE)
   {
   // too large for the stack: format into a temporary heap buffer
   auto ptr = std::make_unique<char[]>(calculatedBufferSize);
   snprintf (ptr.get(), calculatedBufferSize, formatstring, value1, value2);
   return std::string(ptr.get());
   }
else
   {
   char buffer[BUFFERSIZE];
   snprintf (buffer, BUFFERSIZE, formatstring, value1, value2);
   return std::string(buffer);
   }

This makes the code even more complex, and if the calculatedBufferSize is larger than what we want on the stack, we essentially do the following:

  • allocate memory (make_unique)
  • fill the memory with the wanted result
  • allocate memory (std::string)
  • copy memory to the string
  • deallocate memory

Since C++17, std::string has a non-const data() method, which suggests that writing through the returned pointer is a supported way to manipulate the string. So it seems tempting to do this:

std::string result;
result.resize(calculatedBufferSize);
snprintf (result.data(), calculatedBufferSize, formatstring, value1, value2);
result.resize(strlen(result.c_str()));
return result;

My experiments show that the last resize is needed to make sure that the length of the string is reported correctly. std::string::length() does not search for a nul-terminator, it just returns the size (just like std::vector does).
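To make the size-versus-terminator point concrete, here is a minimal sketch. The `Lengths` struct and the `lengths_before_trim` helper are made up for illustration; the sketch assumes a C++17 compiler so that data() is non-const.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper type: the two lengths we want to compare.
struct Lengths
{
    std::size_t reported; // what std::string::size() says
    std::size_t actual;   // what strlen() finds by scanning for '\0'
};

// Formats the number 42 into a string pre-sized to `capacity` characters
// and reports both lengths *before* any trimming resize.
Lengths lengths_before_trim(std::size_t capacity)
{
    std::string s(capacity, '\0');
    std::snprintf(s.data(), s.size(), "%d", 42);
    return Lengths{s.size(), std::strlen(s.c_str())};
}
```

Calling lengths_before_trim(16) should give reported == 16 and actual == 2, which is why the string has to be resized after the C function has filled it.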

Notice that we have much less allocation and copying going on:

  • allocate memory (resize string)
  • fill the memory with the wanted result

To be honest, although it seems to be much more efficient, it also looks very 'un-standard' to me. Can somebody indicate whether this behavior is allowed by the C++17 standard? Or is there another, more efficient way to do this kind of manipulation?

Please don't refer to the question Manipulating std::string, as that question is about much dirtier logic (even using memset). Also don't answer that I must use C++ streams (std::stringstream, efficient?, honestly?). Sometimes you simply have efficient logic in C that you want to reuse.

Patrick asked Dec 18 '22 19:12


1 Answer

Modifying the contents pointed to by data() is fine, assuming you do not set the value at data() + size() to anything other than the null character. From [string.accessors]:

charT* data() noexcept;

Returns: A pointer p such that p + i == addressof(operator[](i)) for each i in [0, size()].

Complexity: Constant time.

Remarks: The program shall not modify the value stored at p + size() to any value other than charT(); otherwise, the behavior is undefined.
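In practice the quoted wording means that any C function that writes only into the range [data(), data() + size()) is acceptable. As a hedged illustration with a different C function than the one in the question, here is a sketch that lets std::strftime fill the string. The names `make_date` and `format_date` and the buffer size of 32 are assumptions chosen for the example.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical helper: builds a std::tm for a calendar date (time of day zeroed).
std::tm make_date(int year, int month, int day)
{
    std::tm when{};
    when.tm_year = year - 1900; // tm_year counts years since 1900
    when.tm_mon = month - 1;    // tm_mon is zero-based
    when.tm_mday = day;
    return when;
}

// Lets std::strftime write straight into the string's own storage.
std::string format_date(const std::tm& when)
{
    std::string result(32, '\0'); // 32 is an arbitrary, generous upper bound
    // strftime writes at most result.size() bytes including the terminator,
    // so it never touches the element at data() + size().
    const std::size_t written =
        std::strftime(result.data(), result.size(), "%Y-%m-%d", &when);
    result.resize(written); // strftime returns 0 if the output did not fit
    return result;
}
```

With the default "C" locale, format_date(make_date(2022, 12, 18)) should yield "2022-12-18".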


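The statement result.resize(strlen(result.c_str())); does look a bit odd, though. std::snprintf returns the number of characters that would have been written had the buffer been large enough (not counting the terminating null character), or a negative value on error. Using that value to resize the string is more appropriate than scanning for the terminator, provided it is clamped to what actually fits in the buffer. Additionally, it looks slightly neater to construct the string with the correct size instead of constructing an empty one that is immediately resized:

std::string result(maxlen, '\0');
const int written = std::snprintf(result.data(), maxlen, fmt, value1, value2);
// assumes maxlen > 0; written < 0 means error, written >= maxlen means truncation
result.resize(written < 0 ? 0 : std::min<std::size_t>(written, maxlen - 1));
return result;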
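If the required size is not known up front, one common idiom (not part of the original answer, so treat it as a sketch under assumptions) is to call std::snprintf twice: once with a null buffer and size 0 to measure, and once to write. The name `format_to_string` is hypothetical. The second call passes size() + 1 so that the terminator lands on data() + size(), which the quoted remark permits because the value written there is the null character.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: printf-style formatting straight into a std::string.
template <typename... Args>
std::string format_to_string(const char* fmt, Args... args)
{
    // First pass: a null buffer with size 0 only measures the output.
    const int needed = std::snprintf(nullptr, 0, fmt, args...);
    if (needed < 0)
    {
        return {}; // formatting error: return an empty string
    }

    // One allocation of exactly the right size.
    std::string result(static_cast<std::size_t>(needed), '\0');

    // Second pass: the terminating '\0' is written to data()[size()],
    // which may only ever be overwritten with the null character.
    std::snprintf(result.data(), result.size() + 1, fmt, args...);
    return result;
}
```

For example, format_to_string("%d-%s", 7, "abc") should return "7-abc". The price is that the arguments are formatted twice, so whether this beats a fixed upper bound depends on how expensive the formatting is.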
You answered Dec 28 '22 23:12