What is the reason behind the following C char array storage implementation?

Why does C store the two char declarations below differently?

char *ch1 = "Hello"; // Read-only data
/* If we try ch1[1] = ch1[2];
   we get a seg fault, since the value is stored in
   the constant code segment. */

char ch2[] = "World"; // Read-write data
/* If we try ch2[1] = ch2[2]; it works. */

According to the book Head First C (pages 73-74), the contents of the ch2[] array are stored both in the constant code segment and on the function stack. What is the reason for duplicating the data in both code and stack memory? Why can't the value be kept only on the stack if it is not read-only data?

Ashwin asked Aug 20 '15 06:08


1 Answer

First, let's clear something up. String literals are not necessarily read-only data; it's just that trying to change them is undefined behaviour.

It doesn't necessarily have to crash; it may work just fine. But, being undefined behaviour, you shouldn't rely on it if you want your code to run on another implementation, another version of the same implementation, or even next Wednesday.

This rule may well stem from a time before standards were in place (the original ANSI/ISO mandate was to codify existing practice rather than create a new language). In many implementations, string literals would share space for efficiency, so that code such as:

char *good = "successful";
char *bad = "unsuccessful";

resulting in:

good---------+
bad--+       |
     |       |
     V       V
   | u | n | s | u | c | c | e | s | s | f | u | l | \0 |

Hence, if you changed one of the characters in good, it would also change bad.

The reason you can safely modify the characters with something like:

char indifferent[] = "meh";

is that, while good and bad point to a string literal, that statement actually creates a character array big enough to hold "meh" (including its terminating null) and then copies the data into it [1]. The copy of the data can be freely changed.

In fact, the C99 rationale document explicitly cites sharing as one of the reasons string literals need not be modifiable:

String literals are not required to be modifiable. This specification allows implementations to share copies of strings with identical text, to place string literals in read-only memory, and to perform certain optimizations.

But regardless of why, the standard is quite clear on the what. From C11 6.4.5 String literals:

7/ It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

The array case (char indifferent[] = "meh";) is covered in 6.7.6 Declarators and 6.7.9 Initialization.


[1] Though it's worth noting that the normal "as if" rules apply here (as long as an implementation acts as if it's following the standard, it can do what it pleases).

In other words, if the implementation can detect that you never try to change the data, it can quite happily bypass the copy and use the original.

paxdiablo answered Sep 27 '22 23:09
