
Struct vs string literals? Read only vs read-write? [duplicate]

Does the C99 standard permit writing to compound literals (structs)? It seems it doesn't permit writing to string literals. I ask because of this passage on page 406 of C Programming: A Modern Approach, 2nd Edition:

Q. Allowing a pointer to a compound literal would seem to make it possible to modify the literal. Is that the case?

A. Yes. Compound literals are lvalues that can be modified.

But, I don't quite get how that works, and how that works with string literals which you certainly can't modify.

#include <stdio.h>

char *foo = "foo bar";
struct bar { char *a; int g; };
struct bar *baz = &(struct bar){.a = "foo bar", .g = 5};

int main(void) {
  // Segfaults
  // (baz->a)[0] = 'X';
  // printf( "%s", baz->a );

  // Segfaults
  // foo[0] = 'a';
  // printf("%s", foo);

  baz->g = 9;
  printf("%d", baz->g);

  return 0;
}

As you can see from the commented-out lines, writing to baz->a causes a segfault, but writing to baz->g does not. Why does one of them segfault and not the other? How are struct literals different from string literals? Why aren't struct literals also placed in a read-only section of memory, and is the behavior defined or undefined in each case (standards question)?

NO WAR WITH RUSSIA asked Aug 23 '18 22:08


2 Answers

First things first: your struct literal has a pointer member initialized to point at a string literal. The members of the struct itself are writeable, including the pointer member. It is only the content of the string literal that is not writeable.

String literals have been part of the language since the beginning, while struct literals (officially known as compound literals) are a relatively recent addition, introduced in C99. By that time many implementations existed that placed string literals in read-only memory, especially on embedded systems with tiny amounts of RAM. The designers of the standard therefore had a choice: require string literals to be moved to a writeable location, allow struct literals to be read-only, or leave things as they were. None of the three options was ideal, so it looks like they took the path of least resistance and left everything the way it was.

Does the C99 standard permit writing to compound literals (structs)?

The C99 standard does not explicitly prohibit writing to the objects created by compound literals. This is different from string literals, whose modification the standard classifies as undefined behavior.

Sergey Kalinichenko answered Sep 23 '22 13:09


The standard essentially assigns the same characteristics to string literals and to compound literals with a const-qualified type used outside the body of a function.

Lifetime

  • String literals: Always static.

    §6.4.5p6 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.

  • Compound literals: Automatic if used inside a function body, otherwise static.

    §6.5.2.5p5 The value of the compound literal is that of an unnamed object initialized by the initializer list. If the compound literal occurs outside the body of a function, the object has static storage duration; otherwise, it has automatic storage duration associated with the enclosing block.

Possibly shared

  • Both string literals and const-qualified compound literals might be shared. You should be prepared for the possibility but cannot rely on it happening.

§6.4.5p7 It is unspecified whether [the arrays created for the string literals] are distinct provided their elements have the appropriate values.

§6.5.2.5p7 String literals, and compound literals with const-qualified types, need not designate distinct objects.

Mutability

  • Modifying either a string literal or a const-qualified compound literal is undefined behaviour. Indeed attempting to modify any const-qualified object is undefined behaviour, although the wording of the standard is probably subject to hair-splitting.

§6.4.5p7 If the program attempts to modify [the array containing a string literal], the behavior is undefined.

§6.7.3p6 If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.

  • A non-const-qualified compound literal can be freely modified. I don't have a quote for this, but the fact that modification is not explicitly prohibited seems to me to be definitive. It's not necessary to explicitly say that mutable objects may be mutated.

The fact that the lifetime of compound literals inside function bodies is automatic can lead to subtle bugs:

/* This is fine */
const char* foo(void) {
  return "abcde";
}

/* This is not OK: the array has automatic storage and ceases to exist when oops returns */
const int* oops(void) {
  return (const int[]){1, 2, 3, 4, 5};
}
rici answered Sep 24 '22 13:09