I found some strange behavior in gcc with the following code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char str[] = "This string contains é which is a multi-byte character";
    const char search[] = "Ʃ";
    char * pos = strstr(str, search);
    printf("%s\n", pos);
    return 0;
}
The compiler produces a warning:
$ gcc toto.c -std=c99
toto.c: In function ‘main’:
toto.c:8:18: warning: initialization discards ‘const’ qualifier
from pointer target type [enabled by default]
But if I change the content of search:
const char search[] = "é";
The very same compilation does not produce any warning. Why?
Note: I get exactly the same behavior if I swap Ʃ and é: if the character in search is not present in str, I get the warning.
The const qualifier declares an object as read-only: its value is set at initialization and cannot be changed afterwards. A const object cannot be used where a modifiable lvalue is required, so it cannot appear on the left-hand side of an assignment. For example, if you have a constant holding the value of PI, you would not want any part of the program to modify it. (In C++, const can also qualify member functions, and a const object cannot call any of its non-const member functions.)
A couple of things are going on here.
gcc's header files instruct gcc to use its built-in, optimized strstr(), so the compiler knows what the function is. Purely from a language standpoint, strstr() is just some library function that, in theory, the compiler doesn't know about. But gcc actually knows what it is.
gcc's optimized version of strstr() seems to behave like this: if the string parameter is a char *, strstr() returns a char *; but if the string parameter is a const char *, strstr() returns a const char *, which makes sense.
So, in your case, strstr() returns a const char *, which results in an obvious error: assigning it to a non-const char *.
What also appears to be happening is that, in the second part of your question, gcc figures out that the string exists and optimizes the whole thing away; but in that case it should also result in a const char * to char * conversion, and a warning. Not sure about this one.
It appears to be a bug in gcc, corrected in a later release.
Here's a small program I've written to illustrate the problem.
#include <stdio.h>
#include <string.h>

int main(void) {
    const char message[] = "hello";
#ifdef ASCII_ONLY
    const char search_for[] = "h";
#else
    const char search_for[] = "Ʃ";
#endif
    char *non_const_message = strstr(message, search_for);
    if (non_const_message == NULL) {
        puts("non_const_message == NULL");
    }
    else {
        puts(non_const_message);
    }
}
When I compile this with
gcc -DASCII_ONLY -std=c99 -pedantic-errors c.c -o c
(using gcc 4.8.2 on Linux Mint 17), it compiles with no diagnostic messages and the resulting program prints
hello
(I use -pedantic-errors because that causes gcc to (attempt to) be a conforming ISO C compiler.)
When I drop the -DASCII_ONLY option, I get a compile-time error message:
c.c: In function ‘main’:
c.c:11:31: error: initialization discards ‘const’ qualifier from pointer target type
char *non_const_message = strstr(message, search_for);
The strstr function returns a result of type char*, not const char*. It takes two const char* arguments, and with the right search string it can return the value of its first argument. This means it can silently discard the constness of its argument. I consider this to be a flaw in the C standard library, but we're probably stuck with it. Conforming C implementations do not have the option of "fixing" this flaw if they want to remain conforming; they can warn about dangerous uses of strstr, but they can't reject otherwise legal code.
(The flaw could have been avoided by splitting strstr into two functions with different names, one taking a const char* and returning a const char*, and the other taking a char* and returning a char*. The 1989 ANSI C committee didn't take the opportunity to do this, either because they didn't think of it or because they didn't want to break existing code. C++ addresses it by having two overloaded versions of strstr, which was not a possibility for C.)
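As a rough illustration of that two-function idea, here is a minimal sketch. The names find_const and find_mutable are hypothetical and not part of any standard library; both simply wrap strstr, so the only difference is what the types let the caller do with the result.

```c
#include <string.h>

/* Hypothetical name: read-only search.
 * A const haystack goes in, a const pointer comes out,
 * so the caller cannot write through the result by accident. */
const char *find_const(const char *haystack, const char *needle)
{
    return strstr(haystack, needle);
}

/* Hypothetical name: search in a buffer the caller owns and may modify.
 * Passing a const char* here would itself draw a diagnostic. */
char *find_mutable(char *haystack, const char *needle)
{
    return strstr(haystack, needle);
}
```

With wrappers like these, char *p = find_const(message, "h"); would be diagnosed for every search string, not just for some of them, because the const is carried by the declared return type.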
My first assumption was that gcc is "magically" doing something similar to what C++ does, but examples that discard const using only ASCII characters don't cause a diagnostic message. As my test program shows, the problem is triggered by the use of a non-ASCII character in a string literal ("Ʃ" rather than "h").
When I use gcc 4.9.1 (which I installed from source) rather than gcc 4.8.2 (the default version installed on my system), the problem goes away:
$ gcc -DASCII_ONLY -std=c99 -pedantic-errors c.c -o c && ./c
hello
$ gcc -std=c99 -pedantic-errors c.c -o c && ./c
c.c: In function ‘main’:
c.c:11:31: error: initialization discards ‘const’ qualifier from pointer target type
char *non_const_message = strstr(message, search_for);
^
$ gcc-4.9.1 -DASCII_ONLY -std=c99 -pedantic-errors c.c -o c && ./c
hello
$ gcc-4.9.1 -std=c99 -pedantic-errors c.c -o c && ./c
non_const_message == NULL
$
I haven't tracked down the bug further than that, but you could probably find it in a list of gcc bugs fixed between 4.8.2 and 4.9.1.
For the code in the question, you can avoid the problem by defining pos as const char* rather than char*. It should be const char* anyway, since it points to an object that was defined as const.
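A minimal sketch of that fix, wrapped in a helper so it can be reused. The name print_match is hypothetical. The NULL check is an addition beyond the question's code: strstr returns NULL when the search string is absent, and passing a null pointer to printf's %s is undefined behavior.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints the text from the first match onwards.
 * Returns 1 if needle was found in haystack, 0 otherwise. */
int print_match(const char *haystack, const char *needle)
{
    /* const char* matches the constness of the searched string,
     * so no qualifier is discarded. */
    const char *pos = strstr(haystack, needle);

    if (pos == NULL) {
        puts("(not found)");
        return 0;
    }
    puts(pos);
    return 1;
}
```

Called as print_match(str, search) with the arrays from the question, this should compile without the qualifier diagnostic whichever character search contains.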