Enum storage differences in C vs. C++

Tags: c++, c, enums

I came across the following construct in a C project that I am porting to C++:

enum TestEnum 
{
    A=303,
    B=808
} _TestEnum;

int foo()
{
  _TestEnum = B;
}

When compiling with GCC and taking a look at the generated code I get:

nils@doofnase ~ $ gcc -std=c90 -O2 -c ./test.c -o test.o
nils@doofnase ~ $ size test.o
   text    data     bss     dec     hex filename
     59       0       0      59      3b test.o

So zero bytes are used in the data or BSS segments.

On the other hand if I compile in C++ I get:

nils@doofnase ~ $ g++ -std=c++11 -O2 -c ./test.c -o test.o
nils@doofnase ~ $ size test.o
   text    data     bss     dec     hex filename
     59       0       4      63      3f test.o

I see four bytes of storage allocated in BSS, as I would expect.

Also, in the C project the enum definition is actually located in a header file which gets included in multiple .c files. The project compiles and links just fine. When compiled and linked as C++, the linker complains that _TestEnum is defined in multiple objects (rightly so!).

What is going on here? Am I looking at some archaic C language special case?

Edit: For completeness' sake, this is the gcc version:

nils@doofnase ~ $ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
Nils Pipenbrinck asked Feb 05 '23 02:02

1 Answer

By default, GCC enables a C extension (even when the -pedantic flag is in effect) that allows multiple external definitions of an object across translation units.

Referring to C11 (N1570) J.5.11 Multiple external definitions (informative section):

There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).

Note that a program that relies on this behavior is not strictly conforming ISO C. More specifically, C11 6.9/p5 External definitions states:

An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.

Technically, a violation of that rule invokes undefined behavior, which means that an implementation is not required to issue a diagnostic message.
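For contrast, what ISO C does permit (C11 6.9.2) is repeating a tentative definition inside a single translation unit. The sketch below is only an illustration with made-up names (Color, current, counter and check_tentative are not from the question). It should compile as C, whereas a C++ compiler rejects it as a redefinition, because C++ has no tentative definitions and every such file-scope declaration is a full definition there. That is also why the C++ build in the question places the variable in BSS.

```c
#include <assert.h>

/* Hypothetical names, chosen only for this illustration. */
enum Color { RED = 303, GREEN = 808 };

/* File-scope declarations without an initializer are tentative
 * definitions in C; repeating them in one translation unit is allowed. */
enum Color current;
enum Color current;
enum Color current = GREEN;   /* the one real definition; only one may carry an initializer */

/* Only tentative definitions: C treats this as "int counter = 0;"
 * at the end of the translation unit. */
int counter;
int counter;

/* Small self-check that both objects ended up with the expected values. */
void check_tentative(void)
{
    assert(current == GREEN);
    assert(counter == 0);
}
```

Calling check_tentative() from main should pass silently. The GCC extension discussed in this answer stretches the same idea across several translation units, which the standard does not guarantee.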

You can verify that this extension is in effect with the nm command:

nm test.o 
0000000000000000 T foo
0000000000000004 C _TestEnum

According to man nm:

"C" The symbol is common. Common symbols are uninitialized data. When linking, multiple common symbols may appear with the same name. If the symbol is defined anywhere, the common symbols are treated as undefined references.

To disable this extension, use the -fno-common flag (GCC 10 and later enable -fno-common by default). From the GCC documentation for releases before that change:

Unix C compilers have traditionally allocated storage for uninitialized global variables in a common block. This allows the linker to resolve all tentative definitions of the same variable in different compilation units to the same object, or to a non-tentative definition. This is the behavior specified by -fcommon, and is the default for GCC on most targets.
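A portable fix that should behave the same in C and C++ is to put only an extern declaration in the header and to define the object in exactly one source file. The following is a minimal single-file sketch of that layout. The names g_test_enum, check_foo and test_enum.h are assumptions made for illustration, and the variable is renamed because identifiers that start with an underscore followed by an uppercase letter, such as _TestEnum, are reserved for the implementation.

```c
#include <assert.h>

/* --- would live in the shared header (a hypothetical test_enum.h) --- */
enum TestEnum
{
    A = 303,
    B = 808
};
extern enum TestEnum g_test_enum;   /* declaration only: reserves no storage */

/* --- would live in exactly one source file --- */
enum TestEnum g_test_enum;          /* the single definition, zero-initialized */

/* --- any other source file that includes the header --- */
void foo(void)
{
    g_test_enum = B;
}

/* Small self-check: after foo() the shared object holds B. */
void check_foo(void)
{
    foo();
    assert(g_test_enum == B);
}
```

With this layout every object file except one refers to g_test_enum as an undefined symbol, so the link no longer depends on common symbols and should also succeed with -fno-common.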

Grzegorz Szpetkowski answered Feb 20 '23 17:02