Is it undefined behavior to redefine a standard name?

It's easy to reason about how code like this would work:

#include <string.h>

#define strcmp my_strcmp

int my_strcmp(const char *, const char *);

...
strcmp(str1, str2);
...

But the question is whether this is technically correct.

From C11:

7.1.3.1 (on reserved names):

...

  • Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise (see 7.1.4).
  • All identifiers with external linkage in any of the following subclauses (including the future library directions) and errno are always reserved for use as identifiers with external linkage.184)
  • Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.

184 The list of reserved identifiers with external linkage includes math_errhandling, setjmp, va_copy, and va_end.

So this means that strcmp is a reserved identifier because string.h is included.

7.1.3.2:

... If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

Now this seems to say that redefining strcmp is undefined behavior, unless 7.1.4 somehow allows it.

The possibly relevant contents of 7.1.4 are:

7.1.4.1:

... Any function declared in a header may be additionally implemented as a function-like macro defined in the header, so if a library function is declared explicitly when its header is included, one of the techniques shown below can be used to ensure the declaration is not affected by such a macro. Any macro definition of a function can be suppressed locally by enclosing the name of the function in parentheses, because the name is then not followed by the left parenthesis that indicates expansion of a macro function name. For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro.185) The use of #undef to remove any macro definition will also ensure that an actual function is referred to. ...

185 This means that an implementation shall provide an actual function for each library function, even if it also provides a macro for that function.
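As a rough illustration of what 7.1.4.1 does allow, the sketch below uses the techniques the quoted text names: parenthesizing the function name, taking its address, and using #undef. The wrapper names (compare_parenthesized, compare_via_pointer, compare_after_undef) are made up for this example. None of these techniques defines strcmp as a macro, which is what the code in the question does.

```c
#include <string.h>

/* Technique 1: parenthesize the name. (strcmp) is not followed directly
   by '(', so a function-like macro named strcmp, if the header provides
   one, is not expanded and the real function is called. */
int compare_parenthesized(const char *a, const char *b)
{
    return (strcmp)(a, b);
}

/* Taking the address works for the same syntactic reason. */
int compare_via_pointer(const char *a, const char *b)
{
    int (*fn)(const char *, const char *) = strcmp;
    return fn(a, b);
}

/* Technique 2: remove any macro definition the header may have supplied.
   After this line, strcmp can only name the actual function. */
#undef strcmp

int compare_after_undef(const char *a, const char *b)
{
    return strcmp(a, b);
}
```

All three wrappers end up calling the library's own strcmp, so they agree with one another on any pair of strings.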

7.1.4.2:

Provided that a library function can be declared without reference to any type defined in a header, it is also permissible to declare the function and use it without including its associated header.
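A minimal sketch of what 7.1.4.2 permits, assuming a translation unit that does not include string.h: strcmp's prototype mentions only built-in types, so the program may declare it itself. The helper name same_string is hypothetical. A function such as strlen would not qualify in the same way, since its prototype refers to size_t, a type defined in a header.

```c
/* No #include <string.h> here. strcmp's prototype uses only built-in
   types, so the program may supply the declaration itself. */
int strcmp(const char *s1, const char *s2);

/* Hypothetical helper: returns 1 when the two strings are equal. */
int same_string(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Note that this declares strcmp with its standard signature and external linkage. It does not replace the function or define the name as a macro.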

The rest of the clauses are irrelevant. I don't see what 7.1.3.2 means by "as allowed by 7.1.4", other than the standard header itself additionally defining the library function as a macro.

In summary, is the code above technically undefined behavior? What if string.h were not included?

Shahbaz asked Feb 08 '13 10:02


2 Answers

At least one reason why it's UB is that string.h can introduce macros. For internal implementation reasons, those macros might have been written on the assumption that strcmp is the "real" strcmp function. If you define strcmp to be something else and then use those macros, strcmp will expand to my_strcmp in the macros, with unexpected consequences.
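The effect can be sketched without touching any reserved name. In the example below, lib_compare and LIB_EQUAL are hypothetical stand-ins for a library function and for a macro that a header might define in terms of it, and my_compare stands in for the user's replacement. They are not taken from any real implementation.

```c
/* Stand-in for a library function (hypothetical name). */
static int lib_compare(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Stand-in for a macro a header might define, written on the
   assumption that lib_compare names the library's own function. */
#define LIB_EQUAL(a, b) (lib_compare((a), (b)) == 0)

/* User replacement: treats every pair of strings as equal. */
static int my_compare(const char *a, const char *b)
{
    (void)a;
    (void)b;
    return 0;
}

/* Used before the redefinition: expands to the real lib_compare. */
int equal_before(const char *a, const char *b)
{
    return LIB_EQUAL(a, b);
}

/* The user's redefinition, analogous to #define strcmp my_strcmp. */
#define lib_compare my_compare

/* Same macro, but lib_compare inside it now expands to my_compare. */
int equal_after(const char *a, const char *b)
{
    return LIB_EQUAL(a, b);
}
```

Here equal_before("a", "b") yields 0, while equal_after("a", "b") yields 1, even though both functions use the same LIB_EQUAL macro. Macro bodies are rescanned at the point of use, so the later #define silently changes what the header's macro does.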

Rather than try to work out exactly what code would be OK in the ... and what would not, the standard puts an early stop to your shenanigans.

Also note that, aside from the fact that the standard flat-out forbids it, your #define strcmp my_strcmp might be an incompatible macro redefinition, because string.h is permitted to do #define strcmp __strcmp or whatever. So on some conforming implementations your code is ill-formed.
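Whether a given implementation actually defines strcmp as a macro can be checked without redefining anything. The following is only a sketch, and the function name strcmp_is_macro is made up for this example.

```c
#include <string.h>

/* Returns 1 if this implementation's <string.h> defines strcmp as a
   macro, 0 otherwise. The answer varies between implementations. */
int strcmp_is_macro(void)
{
#ifdef strcmp
    return 1;
#else
    return 0;
#endif
}
```

On an implementation where this returns 1, a later #define strcmp my_strcmp would be a second definition of an existing macro with a different replacement list.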

Steve Jessop answered Sep 20 '22 14:09


A program that declares or defines reserved identifiers is not strictly conforming (C 2011, clause 4, paragraph 5) but may be conforming (C 2011, clause 4, paragraph 7).

The dispute that prompted this question was not about whether declaring or defining a reserved identifier is behavior that the C standard leaves undefined. It was about whether that behavior may be defined by other means, such as the documentation for a specific C implementation, and whether a program author could therefore do it.

Some people treat “undefined behavior” as meaning “You may not do this.” This is an incorrect interpretation of “undefined behavior.” Undefined behavior is not something the standard requires you to avoid; it is something the C standard does not help you with.

The C standard explicitly states that it imposes no requirements on undefined behavior. In particular, this means there is no requirement that you may not do it and no requirement that another specification may not define the behavior. Nearly every practical program uses behavior that the C standard does not define, when it makes system calls defined by the operating system documentation or library calls defined by the library documentation or relies on the formats of data types defined by the specific C implementation it is designed for.
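As a small illustration of behavior defined by another specification, the sketch below assumes a POSIX system. The header unistd.h and the function getpid are specified by POSIX, not by the C standard, and the wrapper name current_process_id is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>   /* POSIX header, not part of ISO C */

/* The C standard says nothing about getpid(); its behavior is defined
   by POSIX and by the operating system's documentation. */
long current_process_id(void)
{
    return (long)getpid();
}
```

The C standard imposes no requirements on this program, yet it behaves predictably on any system that documents these interfaces.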

In C, “undefined behavior” is merely the end of the rules set by the C standard. It is an open field that you can navigate using other means and not a wall that blocks your progress.

Eric Postpischil answered Sep 20 '22 14:09