Why does gcc allow extern declarations of type void? Is this an extension or standard C? Are there acceptable uses for this?
I am guessing it is an extension, but I don't find it mentioned at:
http://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/C-Extensions.html
$ cat extern_void.c
extern void foo; /* ok in gcc 4.3, not ok in Visual Studio 2008 */
void* get_foo_ptr(void) { return &foo; }
$ gcc -c extern_void.c # no compile error
$ gcc --version | head -n 1
gcc (Debian 4.3.2-1.1) 4.3.2
Defining foo as type void is of course a compile error:
$ gcc -c -Dextern= extern_void.c
extern_void.c:1: error: storage size of ‘foo’ isn’t known
For comparison, Visual Studio 2008 gives an error on the extern declaration:
$ cl /c extern_void.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
extern_void.c
extern_void.c(1) : error C2182: 'foo' : illegal use of type 'void'
Strangely enough (or perhaps not so strangely...) it looks to me like gcc is correct to accept this.
If this was declared static instead of extern, then it would have internal linkage, and §6.9.2/3 would apply:
If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.
If it didn't specify any storage class (extern, in this case), then §6.7/7 would apply:
If an identifier for an object is declared with no linkage, the type for the object shall be complete by the end of its declarator, or by the end of its init-declarator if it has an initializer; in the case of function arguments (including in prototypes), it is the adjusted type (see 6.7.5.3) that is required to be complete.
In either of these cases, void would not work, because (§6.2.5/19):
The void type [...] is an incomplete type that cannot be completed.
None of those applies, however. That seems to leave only the requirements of §6.7.2/2, which seems to allow a declaration of a name with type void:
At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each struct declaration and type name. Each list of type specifiers shall be one of the following sets (delimited by commas, when there is more than one set on a line); the type specifiers may occur in any order, possibly intermixed with the other declaration specifiers.
- void
- char
- signed char
[ ... more types elided]
I'm not sure that's really intentional. I suspect the void is really intended for things like derived types (e.g., pointer to void) or the return type from a function, but I can't find anything that directly specifies that restriction.
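For contrast, here is a minimal sketch of an extern declaration with an incomplete type that standard C clearly does allow: a struct that is completed later. All names (opaque, g_thing, get_thing_ptr, read_thing) are made up for illustration. The difference from extern void foo; is that a struct can eventually be completed, whereas void never can.

```c
#include <assert.h>
#include <stddef.h>

struct opaque;                 /* incomplete type: size not yet known */
extern struct opaque g_thing;  /* extern declaration with an incomplete type */

/* Taking the address is allowed even though the type is still incomplete. */
struct opaque *get_thing_ptr(void)
{
    return &g_thing;
}

/* Later (possibly in another translation unit) the type is completed
   and the object is defined. */
struct opaque { int value; };
struct opaque g_thing = { 42 };

/* Read the value back through the pointer obtained above. */
int read_thing(void)
{
    struct opaque *p = get_thing_ptr();
    assert(p != NULL);
    return p->value;
}
```

Here get_thing_ptr compiles before the struct is completed, which is the same pattern as get_foo_ptr in the question, only with a type that can be completed.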
I've found the only legitimate use for declaring extern void foo; is when foo is a link symbol (an external symbol defined by the linker) that denotes the address of an object of unspecified type.
This is actually useful because link symbols are often used to communicate the extent of memory; i.e. .text section start address, .text section length, etc.
As such, it is important for the code using these symbols to document their type by casting the symbol's address to an appropriate type. For instance, if foo is actually the length of a memory region:
uint32_t textLen;
textLen = ( uint32_t )&foo;
Or, if foo is the start address of that same memory region:
uint8_t *textStart;
textStart = ( uint8_t * )&foo;
The only alternate way to reference a link symbol in "C" that I know of is to declare it as an external array:
extern uint8_t foo[];
I actually prefer the void declaration, as it makes it clear that the linker-defined symbol has no intrinsic "type."
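The external-array alternative can be sketched in portable C as follows. The symbol name demo_region_start and the size DEMO_REGION_SIZE are assumptions for illustration; in a real project the symbol would come from a linker script, and the definition at the bottom is only a stand-in so the example links on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_REGION_SIZE 32u   /* assumption: the real size would come from the linker */

/* How a consumer would declare the linker symbol (hypothetical name). */
extern uint8_t demo_region_start[];

/* The array name decays to the start address; no & or cast is needed. */
uint8_t *demo_region_base(void)
{
    return demo_region_start;
}

/* Bounds-checked read of one byte of the region. */
uint8_t demo_region_read(size_t offset)
{
    assert(offset < DEMO_REGION_SIZE);
    return demo_region_start[offset];
}

/* Stand-in for the linker: defines the symbol so the sketch links on its own. */
uint8_t demo_region_start[DEMO_REGION_SIZE] = { 0xAA, 0xBB };
```

The trade-off mentioned above is visible here: the array declaration is standard C, but it also lets callers index and do arithmetic on the symbol, which may or may not be what the symbol actually means.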
GCC (and also the LLVM C frontend) is definitely buggy. Both Comeau and MS seem to report errors, though.
The OP's snippet has at least two definite UBs and one red-herring:
From N1570
[UB #1] Missing main in hosted environment:
J2. Undefined Behavior
[...] A program in a hosted environment does not define a function named main using one of the specified forms (5.1.2.2.1).
[UB #2] Even if we ignore the above, there still remains the issue of taking the address of a void expression, which is explicitly forbidden:
6.3.2.1 Lvalues, arrays, and function designators
1 An lvalue is an expression (with an object type other than void) that potentially designates an object;64)
and:
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
[Note: emphasis on lvalue mine]
Also, there is a section in the standard specifically on void:
6.3.2.2 void
1 The (nonexistent) value of a void expression (an expression that has type void) shall not be used in any way, and implicit or explicit conversions (except to void) shall not be applied to such an expression.
A file-scope definition is a primary-expression (6.5). So is taking the address of the object denoted by foo. BTW, the latter invokes UB. This is thus explicitly ruled out.
What remains to be figured out is whether removing the extern qualifier makes the above valid or not:
In our case, for foo, §6.2.2/5 applies:
5 [...] If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
i.e. even if we left out the extern we'd still land in the same problem.
One limitation of C's linker-interaction semantics is that it provides no mechanism for allowing numeric link-time constants. In some projects, it may be necessary for static initializers to include numeric values which are not available at compile time but will be available at link time. On some platforms, this may be accomplished by defining somewhere (e.g. in an assembly-language file) a label whose address, if cast to int, would yield the numeric value of interest. An extern declaration can then be used within the C file to make the "address" of that thing available as a compile-time constant.
This approach is very much platform-specific (as would be anything using assembly language), but it makes possible some constructs that would be problematic otherwise. A somewhat nasty aspect of it is that if the label is declared in C as a type like unsigned char[], that will convey the impression that the address may be dereferenced or have arithmetic performed upon it. If a compiler will accept void foo;, then (int)&foo will convert the linker-assigned address for foo to an integer using the same pointer-to-integer semantics as would be applicable with any other void*.
I don't think I've ever used void for that purpose (I've always used extern unsigned char[]) but would think void would be cleaner if something defined it as being a legitimate extension (nothing in the C standard requires that any ability exist anywhere to create a linker symbol which can be used as anything other than one specific non-void type; on platforms where no means would exist to create a linker identifier which a C program could define as extern void, there would be no need for compilers to allow such syntax).
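A rough sketch of the extern unsigned char[] variant of this trick follows. The label name demo_link_const is hypothetical, and the definition at the end is only a stand-in for the assembly file or linker script that would normally supply it, so the number produced here is just whatever address the toolchain assigns. The static initializer uses a pointer rather than an integer because that form is an address constant in standard C; whether an integer cast of an address is accepted directly in a static initializer depends on the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical label; on a real target it would be defined in assembly
   or a linker script so that its address equals the desired number. */
extern unsigned char demo_link_const[];

/* A pointer initialized from an address constant is valid in a static
   initializer in standard C. */
static unsigned char *const demo_link_addr = demo_link_const;

/* Convert the stored address to an integer at run time. */
uintptr_t demo_link_value(void)
{
    assert(demo_link_addr != 0);
    return (uintptr_t)demo_link_addr;
}

/* Stand-in definition so the sketch links without an assembly file. */
unsigned char demo_link_const[1];
```

Using uintptr_t instead of int avoids truncation on platforms where pointers are wider than int, at the cost of relying on an optional (though widely available) type.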