Context: In a recent conversation, the question "does gcc/clang do strlen("static string") at compile time?" came up. After some testing, the answer seems to be yes, regardless of the level of optimization. I was a bit surprised to see this done even at -O0, so I did some more testing, and eventually arrived at the following code:
#include <stdio.h>

unsigned long strlen(const char* s) {
    return 10;
}

unsigned long f() {
    return strlen("abcd");
}

unsigned long g(const char* s) {
    return strlen(s);
}

int main() {
    /* %lu matches the unsigned long return type of f and g */
    printf("%lu %lu\n", f(), g("abcd"));
    return 0;
}
To my surprise, it prints 4 10 and not 10 10. I tried compiling with gcc and clang, and with various flags (-pedantic, -O0, -O3, -std=c89, -std=c11, ...) and the behavior is consistent between the tests.
Since I didn't include string.h, I expected my definition of strlen to be used. But the assembly code indeed shows that strlen("abcd") was basically replaced by return 4 (which matches what I observe when running the program).
Also, the compilers print no warnings with -Wall -Wextra (more precisely, none related to the issue: they still warn that parameter s is unused in my definition of strlen).
Two (related) questions arise (I think they are related enough to be asked in the same question):
- is it allowed to redefine a standard function in C when the header declaring it isn't included?
- does this program behave as it should? If so, what happens exactly?
Per C 2011 (draft N1570) 7.1.3 1 and 2:
All identifiers with external linkage in any of the following subclauses … are always reserved for use as identifiers with external linkage.
If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.
The “following subclauses” specify the standard C library, including strlen. Your program defines strlen, so its behavior is undefined.
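For comparison, the undefined behavior goes away if the stub is given a name that is not reserved. The following is only a sketch under that assumption: my_strlen is a hypothetical name chosen for illustration, and f and g mirror the functions in the question.

```c
/* Hypothetical replacement under a non-reserved name. It always returns 10,
   mirroring the stub from the question. */
unsigned long my_strlen(const char *s) {
    (void)s; /* parameter intentionally unused */
    return 10;
}

/* Same shape as f in the question, but calling the renamed stub. */
unsigned long f(void) {
    return my_strlen("abcd");
}

/* Same shape as g in the question, but calling the renamed stub. */
unsigned long g(const char *s) {
    return my_strlen(s);
}
```

Because my_strlen is an ordinary user-defined function, the compiler has no built-in knowledge of it, so f() and g("abcd") both return 10.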
What is happening in the case you observe is:
- The compiler knows how strlen is supposed to behave, regardless of your definition, so, while optimizing strlen("abcd") in f, it evaluates strlen at compile time, resulting in four.
- In g("abcd"), the compiler fails to recognize that, because of the definition of g, this is equivalent to strlen("abcd"), so it does not evaluate it at compile time. Instead, it compiles it to a call to g, compiles g to call strlen, and also compiles your definition of strlen, with the result that g("abcd") calls g, which calls your strlen, which returns ten.
The C standard would allow the compiler to discard your definition of strlen completely, so that g returned four. However, a good compiler should warn that your program defines a reserved identifier.