
Why does char* cause undefined behaviour while char[] doesn't?

Attempting to modify a string literal causes undefined behavior:

char * p = "wikipedia"; 
p[0] = 'W'; // undefined behaviour

One way to prevent this is to define it as an array instead of a pointer:

char p[] = "wikipedia"; 
p[0] = 'W'; // ok

Why does char* cause undefined behaviour, while char[] doesn't?

Terry Li asked Nov 28 '11 21:11



1 Answer

Any attempt to modify a C string literal has undefined behaviour. A compiler may arrange for string literals to be stored in read-only memory (protected by the OS, not literally ROM unless you're on an embedded system). But the language doesn't require this; it's up to you as a programmer to get it right.

A sufficiently clever compiler could have warned you that you should have declared the pointer as:

const char * p = "wikipedia";

though the declaration without the const is legal in C (for the sake of not breaking old code). But with or without a compiler warning, the const is a very good idea.

(In C++, the rules are different; C++ string literals, unlike C string literals, really are const.)

When you initialize an array with a literal, the literal itself may still exist in a potentially read-only region of your program image, but its bytes are copied into the array. The array is ordinary writable storage that belongs to you, so modifying it is well defined:

char s[] = "wikipedia"; /* initializes the array with the bytes from the string */
char t[] = { 'w', 'i', ... 'a', 0 };  /* same thing */
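The difference between the two declarations can be checked with a small sketch. The function name array_is_independent_copy is made up for illustration; it only writes to the array, never to the literal, so everything it does is well defined.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if writing to the array leaves the
   literal's text untouched, i.e. the array is a separate copy. */
int array_is_independent_copy(void)
{
    const char *literal = "wikipedia"; /* points at the literal; only read from */
    char copy[] = "wikipedia";         /* 10 bytes: 9 characters plus '\0' */

    copy[0] = 'W';                     /* well defined: copy is our own array */

    return strcmp(copy, "Wikipedia") == 0
        && strcmp(literal, "wikipedia") == 0
        && sizeof copy == 10;
}
```

Note that sizeof copy counts the terminating null byte, which is why the array initialized from a 9-character literal occupies 10 bytes.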

Note that neither char u[] = p nor char u[] = *p works: arrays can only be initialized from a brace-enclosed initializer, and char arrays additionally from a string literal.
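When all you have is a pointer and you need a modifiable copy, the bytes have to be copied at run time into an array you have already sized. A minimal sketch, assuming a caller-supplied buffer; copy_to_buffer is a hypothetical helper name, not a standard function:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the string at src into dst, which has room
   for cap bytes. Truncates if src does not fit and always null-terminates.
   Returns dst, or NULL when cap is 0 and nothing can be stored. */
char *copy_to_buffer(char *dst, size_t cap, const char *src)
{
    size_t n;

    if (cap == 0)
        return NULL;

    n = strlen(src);
    if (n >= cap)
        n = cap - 1;   /* leave room for the terminator */

    memcpy(dst, src, n);
    dst[n] = '\0';
    return dst;
}
```

With char u[16] declared, calling copy_to_buffer(u, sizeof u, p) fills u with the text p points to, after which u[0] = 'W' is fine because u is an ordinary array.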

Kerrek SB answered Nov 15 '22 18:11