 

Why do compilers allow string literals not to be const?

And where are literals in memory exactly? (see examples below)

I cannot modify a literal, so I would expect its type to be const char*. Yet the compiler lets me point a char* at one, and I get no warning even with most warning flags enabled.

By contrast, implicitly converting a const char* to a char* gives me a warning, see below (tested on GCC; VC++2010 behaves similarly).

Also, if I modify the value of a const char (using the trick below, for which I would have expected GCC to warn), I get no error, and on GCC I can even modify it and display the new value. I guess this is still undefined behavior, but I wonder why the literal does not behave the same way. That is why I am asking where literals are stored, and where ordinary const objects are usually stored.

const char* a = "test";
char* b = a; /* warning: initialization discards qualifiers 
  from pointer target type (on gcc), error on VC++2k10 */

char *c = "test"; // no compile errors
c[0] = 'p'; /* bus error at run time (we are not supposed to
  modify a const anyway, so why does this compile with no errors? And where
  is the literal stored, given that I get a "bus error"?
  I get 'access violation writing' on VC++2010) */

const char d = 'a';
*(char*)&d = 'b'; // no warnings (why not?)
printf("%c", d);  /* displays 'b' (why doesn't this behave like
  modifying a literal?). It displays 'a' on VC++2010 */
asked Jun 19 '10 09:06 by Dpp



1 Answer

I'm not certain what the C and C++ standards say about string literals, but I can tell you exactly what happens with them in MSVC, and I believe other compilers behave similarly.

String literals reside in a const data section, which is mapped into the process address space. However, the memory pages they are stored in are read-only (unless their protection is explicitly changed at run time).

But there's something more you should know: not all C/C++ expressions containing quotes mean the same thing. Let's go through the cases one by one.

const char* a = "test";

The above statement makes the compiler create a string literal "test", and the linker makes sure it ends up in the executable file. In the function body, the compiler generates code that declares a variable a on the stack and initializes it with the address of the string literal "test".

char* b = a;

Here you declare another variable b on the stack, which gets the value of a. Since a points to a read-only address, so does b. The fact that b has no const qualifier doesn't mean you may modify what it points to.

char *c = "test"; // no compile errors
c[0] = 'p';

The above generates an access violation. Again, the lack of const doesn't mean anything at the machine level.

const char d = 'a';
*(char*)&d = 'b';

First of all, the above is not related to string literals. 'a' is not a string; it's a character, which is just a number. It's like writing the following:

const int d = 55;
*(int*)&d = 56;

The above code fools the compiler: you say the variable is const, yet you manage to modify it. But this does not trigger a processor exception, since d (here a local variable on the stack) resides in read/write memory anyway.

I'd like to add one more case:

char b[] = "test";
b[2] = 'o';

The above declares an array on the stack and initializes it with the string "test". It resides in read/write memory and can be modified. There's no problem here.

answered Oct 22 '22 08:10 by valdo