According to the C standard, if a program defines or declares a reserved identifier, the behavior is undefined. One category of reserved identifiers is identifiers with external linkage defined in the C standard library.
As an example of a program with undefined behavior, consider the following: file1.c defines a variable named time with external linkage, which conflicts with the time function from the standard library, declared in time.h.
file1.c:
int time;
int foo( void )
{
    return time;
}
file2.c:
#include <time.h>
#include <stdio.h>
extern int foo( void );
int main( void )
{
    foo();
    printf( "current time = %ld\n", (long) time( NULL ) );
    return 0;
}
When the program is compiled and run, a segmentation fault occurs, because the time symbol referenced in file2.c gets linked to the time variable from file1.c, rather than to the function in the C library.
$ gcc -c -o file1.o file1.c
$ gcc -c -o file2.o file2.c
$ gcc -o test file1.o file2.o
$ ./test
Segmentation fault (core dumped)
I'm wondering if there is any way for GCC to detect the usage of conflicting, reserved identifiers in user code, at compile or link time. Here's my motivation: I'm working on an application where users can write C extensions to the application, which get compiled and linked to the rest of the application. If the user's C code uses reserved identifiers like the example above, the resulting program can fail in hard-to-predict ways.
One solution which comes to mind is to run something like nm on the user's object files, and compare the defined symbols against a list of reserved identifiers from the C library. However, I am hoping to find something in GCC which can detect the issue. Does anyone know if that is possible, or have any suggestions?
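The nm idea from the question could be sketched roughly as follows. This is only an illustration, not a GCC feature: the list reserved_names is a tiny hypothetical sample, the function names are made up for this sketch, and the parser assumes the default GNU nm line layout (address, one-letter type, name), in which an uppercase type letter other than U marks a defined global symbol.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, deliberately tiny sample of library names. A real list
   would be generated from the C standard or from the target libc. */
static const char *const reserved_names[] = {
    "time", "printf", "malloc", "free", "exit", "strlen"
};

/* Returns true if name matches an entry in the sample list. */
bool is_reserved_name(const char *name)
{
    size_t count = sizeof reserved_names / sizeof reserved_names[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, reserved_names[i]) == 0)
            return true;
    }
    return false;
}

/* Checks one line of nm output, e.g. "0000000000000000 B time".
   Returns true if the line describes a defined global symbol whose
   name is on the sample list. Assumes the default GNU nm layout. */
bool nm_line_conflicts(const char *line)
{
    char type;
    char name[256];

    /* Undefined symbols have no address field, so the hex scanset
       fails to match and sscanf returns 0 for them. */
    if (sscanf(line, "%*[0-9a-fA-F] %c %255s", &type, name) != 2)
        return false;

    /* Lowercase type letters are local symbols; U means undefined. */
    if (!isupper((unsigned char)type) || type == 'U')
        return false;

    return is_reserved_name(name);
}
```

A driver could read the output of nm file1.o line by line with fgets and report every line for which nm_line_conflicts returns true.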
Identifiers that begin with two underscores, or with an underscore followed by an uppercase letter, are always reserved for any use (in practice, for the compiler and the standard library). All other identifiers that begin with an underscore are reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
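The two underscore rules can be expressed as simple predicates. This is a minimal sketch; the function names are invented for illustration, and it checks only the spelling of a name, not the scope in which it is declared.

```c
#include <ctype.h>
#include <stdbool.h>

/* True for names that are always reserved: "__x..." or "_X...". */
bool is_always_reserved(const char *name)
{
    if (name[0] != '_')
        return false;
    return name[1] == '_' || isupper((unsigned char)name[1]);
}

/* True for any name starting with an underscore. Such names are
   reserved at file scope, so user code should avoid them there. */
bool is_reserved_at_file_scope(const char *name)
{
    return name[0] == '_';
}
```

For example, _foo is acceptable as a local variable or struct member but not as a global, while __foo and _Foo should not be used anywhere in user code.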
Characters in identifiers: The first character in an identifier must be a letter or the _ (underscore) character; however, beginning identifiers with an underscore is considered poor programming style. The compiler distinguishes between uppercase and lowercase letters in identifiers.
Rules for writing identifiers: An identifier can be composed only of letters (both uppercase and lowercase), digits, and the underscore '_'. The first character of an identifier must be either a letter or an underscore. Starting an identifier with an underscore is legal but discouraged.
No special characters, such as a semicolon, period, whitespace, slash, or comma, are permitted in an identifier.
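As a rough illustration of these spelling rules, the check below accepts only letters, digits, and underscores, and rejects a leading digit. It is a sketch under simplifying assumptions: the function name is invented, the default "C" locale is assumed for the ctype functions, and keywords and universal character names are ignored.

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true if s is spelled like a basic C identifier:
   a letter or underscore, followed by letters, digits, or underscores. */
bool is_valid_identifier(const char *s)
{
    /* The first character must be a letter or an underscore. */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return false;

    /* Every remaining character must be a letter, digit, or underscore. */
    for (const char *p = s + 1; *p != '\0'; p++) {
        if (!(isalnum((unsigned char)*p) || *p == '_'))
            return false;
    }
    return true;
}
```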
I'm wondering if there is any way for GCC to detect the usage of conflicting, reserved identifiers in user code, at compile or link time.
This adds detail to @PSkocik's good answer.
One way to detect many conflicts is to include all header files. Compilation times may increase noticeably.
Determine the version:
#if defined(__STDC__)
# define STANDARD_C89
# if defined(__STDC_VERSION__)
#  define STANDARD_C90
#  if (__STDC_VERSION__ >= 199409L)
#   define STANDARD_C95
#  endif
#  if (__STDC_VERSION__ >= 199901L)
#   define STANDARD_C99
#  endif
#  if (__STDC_VERSION__ >= 201112L)
#   define STANDARD_C11
#  endif
#  if (__STDC_VERSION__ >= 201710L)
#   define STANDARD_C18
#  endif
# endif
#endif
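To sanity-check which branch of these tests a given compiler takes, the thresholds can be mirrored in an ordinary function. This is a hedged sketch: the function names are hypothetical, the thresholds are the same __STDC_VERSION__ values used above, and anything newer than C17/C18 is simply reported as 2017.

```c
/* Maps a __STDC_VERSION__ value to the year of the standard it
   indicates, using the same thresholds as the macro tests above. */
int c_standard_year(long version)
{
    if (version >= 201710L) return 2017;
    if (version >= 201112L) return 2011;
    if (version >= 199901L) return 1999;
    if (version >= 199409L) return 1995;
    return 1990;
}

/* Reports the standard the current translation unit is compiled as. */
int current_c_standard_year(void)
{
#if defined(__STDC_VERSION__)
    return c_standard_year(__STDC_VERSION__);
#else
    return 1990; /* C89/C90 compilers do not define __STDC_VERSION__ */
#endif
}
```

Printing current_c_standard_year() under different -std= options shows which groups of headers the conditional includes would pull in.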
Include them, some selectively.
#include <assert.h>
//#include <complex.h>
#include <ctype.h>
#include <errno.h>
//#include <fenv.h>
#include <float.h>
//#include <inttypes.h>
//#include <iso646.h>
#include <limits.h>
#include <locale.h>
#include <math.h>
#include <setjmp.h>
#include <signal.h>
#include <stdarg.h>
//#include <stdalign.h>
//#include <stdatomic.h>
//#include <stdbool.h>
#include <stddef.h>
//#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
//#include <stdnoreturn.h>
#include <string.h>
//#include <tgmath.h>
//#include <threads.h>
#include <time.h>
//#include <uchar.h>
//#include <wchar.h>
//#include <wctype.h>
//////////////////////////////
#ifdef STANDARD_C95
#include <iso646.h>
#include <wchar.h>
#include <wctype.h>
#endif
//////////////////////////////
#ifdef STANDARD_C99
#ifndef __STDC_NO_COMPLEX__
#include <complex.h>
#endif
#include <fenv.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <tgmath.h>
#endif
//////////////////////////////
#ifdef STANDARD_C11
#include <stdalign.h>
#ifndef __STDC_NO_ATOMICS__
#include <stdatomic.h>
#endif
#ifndef __STDC_NO_THREADS__
#include <threads.h>
#endif
#include <stdnoreturn.h>
#include <uchar.h>
#endif
I am certain the above needs some refinements and would appreciate advice on that.
To avoid additions to the name space, instead of code like #define STANDARD_C11, use macro tests directly:
// #ifdef STANDARD_C11
// ... C11 includes
// #endif
#if defined(__STDC__)
# if defined(__STDC_VERSION__)
#  if (__STDC_VERSION__ >= 201112L)
// ... C11 includes
#  endif
# endif
#endif
Although the goal is "According to the C standard ...", additional code may be needed to accommodate popular compiler extensions and slight variations from the standard.
You could grab a libc implementation that you can link statically and, with -Wl,--whole-archive, try to slap it onto your object files.
main.c:
int time=42;
int main(){}
Link it with a whole libc:
$ musl-gcc main.c -static -Wl,--whole-archive
If you get a multiple definition error or a type/size/alignment of symbol changed warning, you're clashing with your libc.
/usr/local/bin/ld: /usr/local/musl/lib/libc.a(time.lo): in function `time':
/home/petr/f/proj/bxdeps/musl/src/time/time.c:5: multiple definition of `time'; /tmp/cc3bL3pP.o:(.data+0x0): first defined here
Alternatively (and more robustly), you could preinclude an all-of-C (or all-of-POSIX) header and have the compiler tell you where you're clashing with it. I'd do it just once in a while, otherwise it's going to somewhat pessimize your build times, although even including all of POSIX generally isn't as bad as including even a single C++ header.