
Initialize std::string from a possibly NULL char pointer

Initializing std::string from a NULL char pointer is undefined behaviour, I believe. So, here are alternative versions of a constructor, where mStdString is a member variable of type std::string:

    // Version 1
    MyClass::MyClass(const char *cstr) :
        mStdString(cstr ? cstr : "") {}

    // Version 2
    MyClass::MyClass(const char *cstr) :
        mStdString(cstr ? std::string(cstr) : std::string()) {}

    // Version 3
    MyClass::MyClass(const char *cstr) {
        if (cstr) mStdString = cstr;
        // else keep default-constructed mStdString
    }

Edit, constructor declaration inside class MyClass:

    MyClass(const char *cstr = NULL);

Which of these, or possibly something else, is the best or most proper way to initialize std::string from a possibly NULL pointer, and why? Is it different for different C++ standards? Assume normal release build optimization flags.

I'm looking for an answer with explanation of why a way is the right way, or an answer with a reference link (this also applies if answer is "doesn't matter"), not just personal opinions (but if you must, at least make it just a comment).

hyde asked Jul 04 '13 07:07





1 Answer

The last one is silly because it doesn't use initialization when it could.

The first two are completely identical semantically (think of the c_str() member function), so prefer the first version because it is the most direct and idiomatic, and easiest to read.
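As a minimal sketch of the first version in a complete class: the accessor `value()` is a hypothetical name added only so the result can be inspected, and `nullptr` is used for the default argument, which assumes C++11 or later (the question uses `NULL`).

```cpp
#include <string>

// Stand-in for the question's MyClass. The accessor value() is an
// assumption for illustration; it is not part of the original code.
class MyClass {
public:
    // A null pointer is mapped to "" so that std::string's
    // const char* constructor never receives a null pointer.
    MyClass(const char *cstr = nullptr)
        : mStdString(cstr ? cstr : "") {}

    const std::string &value() const { return mStdString; }

private:
    std::string mStdString;
};
```

Both operands of the conditional are `const char*`, so a single `std::string(const char*)` constructor call initializes the member in either case.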

(There would be a semantic difference if std::string had a constexpr default constructor, but in C++11 it doesn't; C++20 later made std::string's constructors constexpr. Still, it's possible that std::string() is different from std::string(""), but I don't know any implementations that do this, since it doesn't seem to make a lot of sense. On the other hand, popular small-string optimizations nowadays mean that both versions will probably not perform any dynamic allocation.)
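To make "identical semantically" concrete, here is a small sketch that compares the two empty strings through their public interface. The function name `empty_strings_equivalent` is hypothetical, and `capacity()` is deliberately left out because it is implementation-defined.

```cpp
#include <string>

// Returns true when a default-constructed string and one built from ""
// are indistinguishable through the observers checked below.
inline bool empty_strings_equivalent() {
    const std::string a;      // default constructor
    const std::string b("");  // const char* constructor, empty literal
    return a == b
        && a.empty() && b.empty()
        && a.size() == 0 && b.size() == 0
        && a.c_str()[0] == '\0' && b.c_str()[0] == '\0';
}
```

The check covers value equality only; which constructor ran, and what code it executed, is not observable this way.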


Update: As @Jonathan points out, the two string constructors will probably execute different code, and if that matters to you (though it really shouldn't), you might consider a fourth version:

    : mStdString(cstr ? cstr : std::string())

Both readable and default-constructing.
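The two conditional forms can be put side by side as free functions to show how their operand types differ. This is only a sketch; the helper names `from_cstr_literal` and `from_cstr_default` are made up for illustration.

```cpp
#include <string>

// Both operands are const char*, so the conditional yields a const char*
// and one std::string(const char*) conversion happens on return.
inline std::string from_cstr_literal(const char *cstr) {
    return cstr ? cstr : "";
}

// The operands are const char* and std::string. The pointer operand is
// implicitly converted, so the conditional itself yields a std::string:
// built from cstr in one branch, default-constructed in the other.
inline std::string from_cstr_default(const char *cstr) {
    return cstr ? cstr : std::string();
}
```

Both helpers return an empty string for a null pointer and a copy of the text otherwise; they differ only in which constructor handles the null case.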


Second update: But prefer cstr ? cstr : "". As you can see below, when both branches call the same constructor, this can be implemented very efficiently using conditional moves and no branches. (So the two versions do indeed generate different code, but the first one is better.)


For giggles, I've run both versions through Clang 3.3, with -O3, on x86_64, for a struct foo; like yours and a function foo bar(char const * p) { return p; }:

Default constructor (std::string()):

        .cfi_offset r14, -16
        mov     R14, RSI
        mov     RBX, RDI
        test    R14, R14
        je      .LBB0_2
        mov     RDI, R14
        call    strlen
        mov     RDI, RBX
        mov     RSI, R14
        mov     RDX, RAX
        call    _ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6__initEPKcm
        jmp     .LBB0_3
    .LBB0_2:
        xorps   XMM0, XMM0
        movups  XMMWORD PTR [RBX], XMM0
        mov     QWORD PTR [RBX + 16], 0
    .LBB0_3:
        mov     RAX, RBX
        add     RSP, 8
        pop     RBX
        pop     R14
        ret

Empty-string constructor (""):

        .cfi_offset r14, -16
        mov     R14, RDI
        mov     EBX, .L.str
        test    RSI, RSI
        cmovne  RBX, RSI
        mov     RDI, RBX
        call    strlen
        mov     RDI, R14
        mov     RSI, RBX
        mov     RDX, RAX
        call    _ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6__initEPKcm
        mov     RAX, R14
        add     RSP, 8
        pop     RBX
        pop     R14
        ret

    .L.str:
        .zero    1
        .size    .L.str, 1

In my case, it would even appear that "" generates better code: Both versions call strlen, but the empty-string version doesn't use any jumps, only conditional moves (since the same constructor is called, just with two different arguments). Of course that's a completely meaningless, non-portable and non-transferable observation, but it just goes to show that the compiler doesn't always need as much help as you might think. Just write the code that looks best.

Kerrek SB answered Sep 28 '22 10:09
