I recently saw a colleague of mine using std::string as a buffer:
std::string receive_data(const Receiver& receiver) {
    std::string buff;
    int size = receiver.size();
    if (size > 0) {
        buff.resize(size);
        const char* dst_ptr = buff.data();
        const char* src_ptr = receiver.data();
        memcpy((char*) dst_ptr, src_ptr, size);
    }
    return buff;
}
I guess he wants to take advantage of the automatic destruction of the returned string, so he need not worry about freeing the allocated buffer.
This looks a bit strange to me, since according to cplusplus.com the data() method returns a const char* pointing to a buffer internally managed by the string:
const char* data() const noexcept;
Memcpy-ing to a const char pointer? AFAIK this does no harm as long as we know what we do, but have I missed something? Is this dangerous?
You cannot pass a std::string directly to the C string functions, because std::string is a specialization of the basic_string class template, whereas C strings are char arrays accessed through a const char*. Pros: when dealing exclusively with C++, std::string is usually the best way to go because of its better searching, replacement, and manipulation functions.
Use std::string when you need to store a value. Use const char * when you want maximum flexibility, as almost everything can be easily converted to or from one.
There is no functionality difference between string and std::string because they're the same type.
In some implementations (for example libc++ on 64-bit platforms), std::string has a size of 24 bytes yet can hold strings of up to 22 bytes with no heap allocation.
Don't use std::string as a buffer.
It is bad practice to use std::string as a buffer, for several reasons (listed in no particular order):
std::string was not intended for use as a buffer; you would need to double-check the description of the class to make sure there are no "gotchas" which would prevent certain usage patterns (or make them trigger undefined behavior).
As a concrete example, before C++17 you can't write through the pointer you get from data() - it's a const char *; so your code would cause undefined behavior. (But &(str[0]), &(str.front()), or &(*(str.begin())) would work.)
Using std::strings for buffers is confusing to readers of your function's definition, who assume you would be using std::string for, well, strings. In other words, doing so breaks the Principle of Least Astonishment.
std::unique_ptr would be fine for your case, or even std::vector. In C++17, you can use std::byte for the element type, too. A more sophisticated option is a class with an SSO-like feature, e.g. Boost's small_vector (thank you, @gast128, for mentioning it).
libstdc++ had to change its ABI for std::string to conform to the C++11 standard, so in some cases (which by now are rather unlikely), you might run into some linkage or runtime issues that you wouldn't with a different type for your buffer.
Also, your code may make two heap allocations instead of one (implementation dependent): once upon string construction and another when resize()ing. But that in itself is not really a reason to avoid std::string, since you can avoid the double allocation using the construction in @Jarod42's answer.
You can completely avoid a manual memcpy by calling the appropriate constructor:
std::string receive_data(const Receiver& receiver) {
    return {receiver.data(), receiver.size()};
}
That even handles \0 in a string.
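As a minimal sketch of why the (pointer, length) constructor copes with embedded \0 bytes: the Receiver below is a hypothetical stand-in (the real class is not shown in the question), and it is assumed that size() returns std::size_t so that the braced return does not involve a narrowing conversion.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the Receiver in the question: it only exposes
// a pointer to its bytes and the number of bytes available.
struct Receiver {
    const char* bytes;
    std::size_t count;
    const char* data() const { return bytes; }
    std::size_t size() const { return count; }
};

// The (pointer, length) constructor copies exactly size() characters;
// it does not stop at the first '\0' the way the (pointer) overload would.
std::string receive_data(const Receiver& receiver) {
    return {receiver.data(), receiver.size()};
}
```

With a three-byte input such as {'a', '\0', 'b'}, the returned string should have size 3 rather than 1, because the length is passed explicitly.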
BTW, unless the content is actually text, I would prefer std::vector<std::byte> (or equivalent).
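A rough sketch of the std::vector<std::byte> variant, assuming C++17 and the same hypothetical Receiver interface (data() and size()); the name receive_bytes is made up for illustration:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the Receiver in the question.
struct Receiver {
    const char* bytes;
    std::size_t count;
    const char* data() const { return bytes; }
    std::size_t size() const { return count; }
};

// Returns the received bytes as raw data rather than as text.
std::vector<std::byte> receive_bytes(const Receiver& receiver) {
    // Allocate once, with the final size.
    std::vector<std::byte> buff(receiver.size());
    if (!buff.empty()) {
        // vector::data() on a non-const vector returns a writable pointer.
        std::memcpy(buff.data(), receiver.data(), buff.size());
    }
    return buff;
}
```

The return type now tells callers that the content is binary data, which addresses the readability concern raised above.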
Memcpy-ing to a const char pointer? AFAIK this does no harm as long as we know what we do, but is this good behavior and why?
The current code may have undefined behavior, depending on the C++ version. To avoid undefined behavior in C++14 and below, take the address of the first element. It yields a non-const pointer:
buff.resize(size);
memcpy(&buff[0], receiver.data(), size);
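Put together as a complete function, the pre-C++17 approach might look like the sketch below. The Receiver type is a hypothetical stand-in, and the size check is kept so that &buff[0] is only used on a non-empty string.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the Receiver in the question.
struct Receiver {
    const char* bytes;
    std::size_t count;
    const char* data() const { return bytes; }
    std::size_t size() const { return count; }
};

std::string receive_data(const Receiver& receiver) {
    std::string buff;
    const std::size_t size = receiver.size();
    if (size > 0) {
        buff.resize(size);
        // operator[] on a non-const string returns char&, so &buff[0] is a
        // writable char*; no const_cast or C-style cast is needed.
        std::memcpy(&buff[0], receiver.data(), size);
    }
    return buff;
}
```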
I have recently seen a colleague of mine using std::string as a buffer...
That was somewhat common in older code, especially circa C++03. There are several benefits and downsides to using a string like that. Depending on what you are doing with the code, std::vector can be a bit anemic, so people sometimes used a string instead and accepted the extra overhead of char_traits.
For example, std::string can be a faster container than std::vector on append, and returning a std::vector from a function was impractical in C++98: without move semantics, the vector had to be constructed in the function and copied out unless the compiler elided the copy. Additionally, std::string allowed you to search with a richer assortment of member functions, like find_first_of and find_first_not_of. That was convenient when searching through arrays of bytes.
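To illustrate the kind of byte searching meant here, a small sketch using those two member functions on binary data held in a std::string. The helper names and the padding/line-break scenario are invented for the example.

```cpp
#include <cstddef>
#include <string>

// Offset of the first byte that is not 0x00 padding, or std::string::npos
// if every byte is padding.
std::size_t first_non_padding(const std::string& bytes) {
    return bytes.find_first_not_of('\0');
}

// Offset of the first CR or LF byte, or std::string::npos if there is none.
std::size_t first_line_break(const std::string& bytes) {
    return bytes.find_first_of("\r\n");
}
```

std::vector has no equivalent member functions; there you would reach for algorithms such as std::find_if or std::find_first_of from <algorithm>.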
I think what you really want/need is SGI's Rope class, but it never made it into the C++ standard library. GCC's libstdc++ provides it as an extension (in <ext/rope>).
There is a lengthy discussion about whether this is legal in C++14 and below:
const char* dst_ptr = buff.data();
const char* src_ptr = receiver.data();
memcpy((char*) dst_ptr, src_ptr, size);
I know for certain it is not safe in GCC. I once did something like this in some self tests and it resulted in a segfault:
std::string buff("A");
...
char* ptr = (char*)buff.data();
size_t len = buff.size();
ptr[0] ^= 1; // tamper with byte
bool tampered = HMAC(key, ptr, len, mac);
GCC put the single byte 'A' in register AL. The high 3 bytes were garbage, so the 32-bit register was 0xXXXXXX41. When I dereferenced at ptr[0], GCC dereferenced a garbage address 0xXXXXXX41.
The two takeaways for me were: don't write half-assed self tests, and don't try to make data() a non-const pointer.
From C++17, data can return a non-const char *.
Draft n4659 declares at [string.accessors]:
const charT* c_str() const noexcept;
const charT* data() const noexcept;
....
charT* data() noexcept;
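Under that C++17 overload, the copy can be written without any cast. A minimal sketch, assuming a C++17 compiler; copy_into_string is a made-up helper name:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copies `size` bytes from `src` into a freshly sized string.
std::string copy_into_string(const char* src, std::size_t size) {
    std::string buff(size, '\0');  // allocate once, at the final size
    if (size > 0) {
        // In C++17, data() on a non-const std::string returns char*,
        // so writing through it is allowed for the range [0, size()).
        std::memcpy(buff.data(), src, size);
    }
    return buff;
}
```

Compiled as C++14 or earlier, the same call would fail to compile, since data() returns const char* there and does not convert to memcpy's void* destination.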