What is the implementation reason behind the following behaviour of char arrays?
char *ch1 = "Hello"; // Read-only data
/* if we try ch1[1] = ch1[2];
we will get a seg fault since the value is stored in
the constant code segment */
char ch2[] = "World"; // Read-write data
/* if we try ch2[1] = ch2[2]; it will work. */
According to the book Head First C (pages 73 and 74), the ch2[]
array is stored both in the constant code segment and on the function stack.
What is the reason for duplicating the data in both the code and
stack memory spaces?
Why can't the value be kept only on the stack, given that it is not read-only data?
A character array holds a sequence of characters (including digit characters). Each element of the array is stored at its own memory address, so it can be read or written through its index.
String literals are stored in C as an array of chars, terminated by a null byte. A null byte is a char having a value of exactly zero, written as '\0'. Do not confuse the null byte, '\0', with the character '0', the integer 0, the double 0.0, or the pointer NULL.
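The null terminator can be observed directly. The following is a minimal sketch; the function names are made up for illustration and do not come from the question or the answer.

```c
#include <stddef.h>
#include <string.h>

/* Bytes occupied by the literal "Hello", including the trailing '\0'. */
size_t hello_storage_size(void) {
    return sizeof "Hello";   /* 5 visible characters + 1 null byte */
}

/* Characters that come before the null byte. */
size_t hello_length(void) {
    return strlen("Hello");
}

/* The byte stored right after the last visible character.
   Reading a literal is always fine; only writing to it is a problem. */
char hello_terminator(void) {
    return "Hello"[5];
}
```

Here hello_storage_size() is one larger than hello_length(), and the extra byte is the '\0' that hello_terminator() returns.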
To create an array, specify the element type (such as int), then the name of the array followed by square brackets []. To give it initial values, use a comma-separated list inside curly braces: int myNumbers[] = {25, 50, 75, 100};
In languages such as Java, String objects are immutable: their contents cannot be changed, because any change produces a new String. With a char[] you can still overwrite every element with blanks or zeros once you are done with it. Storing a password in a character array therefore reduces the time the password stays readable in memory.
First, let's clear something up. String literals are not necessarily read-only data; it's just that it's undefined behaviour to try to change them.
It doesn't necessarily have to crash, and it may work just fine. But, being undefined behaviour, you shouldn't rely on it if you want your code to run in another implementation, another version of the same implementation, or even next Wednesday.
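One common way to avoid relying on that undefined behaviour is to point at literals through const char *, so the compiler rejects a write instead of leaving it to chance at run time. This is only a sketch: greeting and copy_greeting are hypothetical names, not part of the question.

```c
#include <string.h>

/* Through a const char *, a statement such as greeting[1] = 'a';
   is a compile-time error rather than undefined behaviour. */
static const char *greeting = "Hello";

/* Hypothetical helper: copies the literal into caller-owned storage,
   which may then be modified freely.
   Returns 0 on success, -1 if the buffer is too small. */
int copy_greeting(char *dst, size_t dst_size) {
    size_t needed = strlen(greeting) + 1;   /* +1 for the '\0' */
    if (dst_size < needed) {
        return -1;
    }
    memcpy(dst, greeting, needed);
    return 0;
}
```

After a successful copy_greeting(buf, sizeof buf), an assignment like buf[1] = buf[2]; is well defined, because buf belongs to the caller.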
This restriction may well stem from a time before standards were in place (the original ANSI/ISO mandate was to codify existing practice rather than create a new language). In many implementations, string literals would share storage for efficiency, as with the code:
char *good = "successful";
char *bad = "unsuccessful";
resulting in:
good---------+
bad--+       |
     |       |
     V       V
   | u | n | s | u | c | c | e | s | s | f | u | l | \0 |
Hence, if you changed one of the characters in good, it would also change bad.
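Whether a given compiler actually overlaps the two literals can be checked, although either outcome is allowed. The sketch below uses const pointers so that nothing is modified; the two helper functions are assumptions added for illustration.

```c
#include <stdbool.h>
#include <string.h>

static const char *good = "successful";
static const char *bad  = "unsuccessful";

/* True if this implementation stored "successful" as the tail of
   "unsuccessful". The result may differ between compilers and
   optimisation levels; both answers are permitted. */
bool literals_share_storage(void) {
    return good == bad + 2;
}

/* Holds everywhere: skipping "un" leaves text equal to "successful",
   whether or not the storage is shared. */
bool tail_matches(void) {
    return strcmp(good, bad + 2) == 0;
}
```

Only tail_matches() has a guaranteed result. Code that depends on literals_share_storage() returning a particular value is not portable.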
The reason you can do it with something like:
char indifferent[] = "meh";
is that, while good and bad point to a string literal, that statement actually creates a character array big enough to hold "meh" and then copies the data into it [1]. The copy of the data can be freely changed.
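The array case from the question can be sketched as follows. The function names and the out parameter are assumptions introduced so the result can be inspected; they are not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Repeats the questioner's ch2 example and copies the result into
   out, which must have room for at least 6 bytes. */
void modify_array_copy(char *out) {
    char ch2[] = "World";        /* 6-char array initialised from the literal */
    ch2[1] = ch2[2];             /* legal: ch2 is the function's own storage */
    memcpy(out, ch2, sizeof ch2);
}

/* Size of an array initialised from "meh": 3 characters + '\0'. */
size_t indifferent_size(void) {
    char indifferent[] = "meh";
    return sizeof indifferent;
}
```

After modify_array_copy(out), out holds "Wrrld". The literal "World" itself is never written to, only the array initialised from it.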
In fact the C99 rationale document explicitly cites this as one of the reasons:
String literals are not required to be modifiable. This specification allows implementations to share copies of strings with identical text, to place string literals in read-only memory, and to perform certain optimizations.
But regardless of why, the standard is quite clear on the what. From C11 6.4.5 String literals:
7/ It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
For the array case (char indifferent[] = "meh";), this is covered in 6.7.6 Declarators and 6.7.9 Initialization.
[1] Though it's worth noting that the normal "as if" rules apply here (as long as an implementation acts as if it's following the standard, it can do what it pleases).
In other words, if the implementation can detect that you never try to change the data, it can quite happily bypass the copy and use the original.