
Double standard? Why only a warning for char* const& a = "bla"?

After digging into the mechanics behind cases like the one raised in this question, I still don't understand why the third line in the code below generates only a warning while the second line is an error.

int main()
{
    const char* const& a = "bla"; // Valid code
    const char*& a2 = "bla"; // Invalid code
    char* const& a3 = "bla"; // Should be invalid but settles for a warning

    return 0;
}

I know that when reference initialization converts the string literal to a pointer, it shouldn't drop any cv-qualifiers the object has. Since the converted type is const char* const (converted from the string literal "bla", i.e. const char[4]), this seems to be the same case as the second line. The only difference is that the const being dropped belongs to the C string itself and not to the pointer.

Reproduces on both GCC 8.2 and Clang 6.0.0 without specifying any extra conformance flags.

Output from gcc:

<source>:4:23: error: cannot bind non-const lvalue reference of type 'const char*&' to an rvalue of type 'const char*'
     const char*& a2 = "Some other string literal";
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~

<source>:5:23: warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]
     char* const& a3 = "Yet another string literal";

Why do current compilers conform to the standard in the first case but not in the second? Or alternatively, is there a fundamental difference I'm missing here between the two cases?

Geezer asked Aug 30 '18 20:08



3 Answers

String literals are arrays. The type of "bla" is const char [4].

const char* const& a = "bla";

This is valid because there's a conversion from T [] to T *; in this case you get a const char * rvalue. This rvalue can be bound to a reference because it's a reference to const (which keeps temporaries alive, etc).

const char*& a2 = "bla";

Invalid because here you're trying to bind a temporary value to a non-const reference.

char* const& a3 = "bla";

This is a reference-to-const, but of the wrong type (it's a pointer-to-char, not pointer-to-const-char). This conversion drops a const qualifier, so it should be invalid. Some C++ compilers allow it for backwards compatibility: in C, string literals have a non-const-qualified type (i.e. "bla" would be a char [4]), so making this a hard error would break lots of existing code.

Even in C++ this used to be legal. Before C++11, assigning a string literal to a char * (not const char *) variable was still allowed (but deprecated).

The "double standard" is because binding a non-const reference to a temporary was never allowed (C doesn't even have references), so there's no backwards compatibility problem there. The standard doesn't distinguish between "errors" and "warnings"; it's at the discretion of the compiler writer whether compilation should succeed for any given violation of the rules.

melpomene answered Sep 28 '22 10:09


Both cases are ill-formed. However, the standard doesn't require compilers to reject ill-formed programs; it only requires a diagnostic, and a warning counts as one. So settling for a warning is fully conforming.

Or alternatively, is there a fundamental difference I'm missing here between the two cases?

The main difference is that binding a non-const lvalue reference to an rvalue has never been well-formed, while implicitly converting a string literal to char* was well-formed (though deprecated) until C++11. Backwards compatibility is a good argument for allowing the latter.

eerorika answered Sep 28 '22 10:09


Let's deconstruct this using EAST const syntax.

The rule of const is that it always applies to what is on the left of it, unless there is nothing to the left of it, in which case it applies to what is immediately on the right. With EAST const, we always write const on the right.

So let's look at the code:

const char* const& a = "bla"; // Valid code

becomes

char const * const & a = "bla";

so the char is constant and can't be changed.

The pointer to the character is constant and can't be changed either.

Overall: this is a reference to a pointer that can't be changed, which points to a character that can't be changed.

"bla" is a const C-style array (char const [4]), which decays to a temporary pointer of type char const *.

That pointer is a temporary rather than a named variable, so there is nothing that could later be re-pointed. The string "bla" itself is compiled into the executable at a fixed location and stays at that address until program termination.

So the types match apart from the top-level const and the reference.

T const &a = something; works when something is a temporary of type T: the reference binds to the temporary and keeps it alive for as long as the reference exists.

Let's look at the second one:

const char*& a2 = "bla"; 

EAST const syntax:

char const * & a2 = "bla";

"bla" decays to a temporary of type:

char const *

The types match, but a2 is a non-const reference, and a non-const reference cannot bind to a temporary.

Maybe this code will make it clearer:

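#include <iostream>

int main()
{
    char const *stringPtr = "hello";
    char const *stringPtr2 = "world";

    // Binding works here: stringPtr is a named, modifiable pointer.
    char const * &stringPtrRef = stringPtr;

    std::cout << stringPtr << std::endl;

    // Assigning through the reference re-points stringPtr itself.
    stringPtrRef = stringPtr2;

    std::cout << stringPtr << std::endl;
}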

This will print "hello" on the first line and "world" on the second, because assigning through the reference changes what stringPtr points to.

Since the pointer that "bla" decays to is a temporary, there is no pointer variable that the reference could later re-point, so we cannot bind a non-const reference to it. There is also no cast that would sensibly force it to become the right type.

This is why it is a hard error rather than a warning.

Let's look at the third one:

char* const& a3 = "bla";

This is already in EAST const format.

With char * const &, the resulting reference would not let you change where the pointer points, but it would let you write through it, for example to try to turn "bla" into "abc". Modifying a string literal is undefined behavior.

Maybe in some instances, you actually want to do that to save memory space on some embedded systems where "bla" was only used as an initialization and never again.

The message makes sense:

"warning: ISO C++ forbids converting a string constant to 'char*'"

because this is essentially the same as:

char const *s1 = "bla";
char *s2 = s1;

which GCC will compile with a warning instead of an error when given the -fpermissive flag.

Even without the -fpermissive, we could change the code to do a cast and make it work.

So, I understand why it can compile, but I think this should be an error. ISO C++ clearly forbids it. My opinion: Require a cast if this is actually really what you want to do.

chaospower answered Sep 28 '22 10:09