
Why does gcc allow extern declarations of type void (non-pointer)?

Tags: c, gcc

Why does gcc allow extern declarations of type void? Is this an extension or standard C? Are there acceptable uses for this?

I am guessing it is an extension, but I don't find it mentioned at:
http://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/C-Extensions.html

$ cat extern_void.c
extern void foo; /* ok in gcc 4.3, not ok in Visual Studio 2008 */
void* get_foo_ptr(void) { return &foo; }

$ gcc -c extern_void.c # no compile error

$ gcc --version | head -n 1
gcc (Debian 4.3.2-1.1) 4.3.2

Defining foo as type void is of course a compile error:

$ gcc -c -Dextern= extern_void.c
extern_void.c:1: error: storage size of ‘foo’ isn’t known

For comparison, Visual Studio 2008 gives an error on the extern declaration:

$ cl /c extern_void.c 
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

extern_void.c
extern_void.c(1) : error C2182: 'foo' : illegal use of type 'void'
David T. Pierson asked Apr 06 '12 03:04


4 Answers

Strangely enough (or perhaps not so strangely...) it looks to me like gcc is correct to accept this.

If this was declared static instead of extern, then it would have internal linkage, and §6.9.2/3 would apply:

If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.

If it didn't specify any storage class (extern, in this case), then §6.7/7 would apply:

If an identifier for an object is declared with no linkage, the type for the object shall be complete by the end of its declarator, or by the end of its init-declarator if it has an initializer; in the case of function arguments (including in prototypes), it is the adjusted type (see 6.7.5.3) that is required to be complete.

In either of these cases, void would not work, because (§6.2.5/19):

The void type [...] is an incomplete type that cannot be completed.

None of those applies, however. That seems to leave only the requirements of §6.7.2/2, which seems to allow a declaration of a name with type void:

At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each struct declaration and type name. Each list of type specifiers shall be one of the following sets (delimited by commas, when there is more than one set on a line); the type specifiers may occur in any order, possibly intermixed with the other declaration specifiers.

  • void
  • char
  • signed char

[ ... more types elided]

I'm not sure that's really intentional -- I suspect the void is really intended for things like derived types (e.g., pointer to void) or the return type from a function, but I can't find anything that directly specifies that restriction.
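For comparison, the same pattern with an incomplete struct type is ordinary standard C, which may help show why an extern declaration of an incomplete type is not rejected outright. This is only a minimal sketch; the names struct opaque, thing, get_thing, and read_thing are hypothetical and do not come from the question.

```c
/* Hypothetical names; a sketch of an extern declaration whose type is
   incomplete at the point of declaration but can be completed later. */
struct opaque;                 /* incomplete type at this point */
extern struct opaque thing;    /* declaration only: no storage is reserved */

/* Taking the address does not require knowing the size of the object. */
struct opaque *get_thing(void) { return &thing; }

/* Later (typically in another translation unit) the type is completed
   and the object is defined. */
struct opaque { int value; };
struct opaque thing = { 42 };

/* Reads the member through the pointer obtained above. */
int read_thing(void) { return get_thing()->value; }
```

The difference with void is that it can never be completed, so no C translation unit can supply the matching definition; any definition would have to come from outside the language, for example from the linker.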

Jerry Coffin answered Nov 01 '22 13:11


I've found the only legitimate use for declaring

extern void foo;

is when foo is a link symbol (an external symbol defined by the linker) that denotes the address of an object of unspecified type.

This is actually useful because link symbols are often used to communicate the extent of memory; i.e. .text section start address, .text section length, etc.

As such, it is important for the code using these symbols to document their type by casting the symbol's address to an appropriate type. For instance, if the address of foo actually encodes the length of a memory region:

uint32_t textLen;

textLen = ( uint32_t )( uintptr_t )&foo;

Or, if foo marks the start address of that same memory region:

uint8_t *textStart;

textStart = ( uint8_t * )&foo;

The only alternate way to reference a link symbol in "C" that I know of is to declare it as an external array:

extern uint8_t foo[];

I actually prefer the void declaration, as it makes it clear that the linker defined symbol has no intrinsic "type."
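A minimal sketch of the array form, assuming the symbol marks the start of some region. The names region_start, region_start_ptr, and region_start_value are hypothetical, and the one-byte definition at the bottom only stands in for what a linker script or assembly file would normally provide, so that the example links on its own.

```c
#include <stdint.h>

/* Hypothetical symbol; in a real project the linker script or an
   assembly file would define it, not C code. */
extern uint8_t region_start[];

/* The array name decays to a pointer, so no & is needed. */
uint8_t *region_start_ptr(void) { return region_start; }

/* Convert the address to an integer, e.g. when the symbol encodes a
   length rather than a location. The result of the conversion is
   implementation-defined. */
uintptr_t region_start_value(void) { return (uintptr_t)region_start; }

/* Stand-in definition so this sketch links by itself. */
uint8_t region_start[1];
```

Unlike the void form, this compiles without relying on a compiler extension, at the cost of suggesting that the symbol can be indexed like a real array.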

user2824114 answered Nov 01 '22 13:11


GCC (and also the LLVM C frontend) is definitely buggy. Both Comeau and MS seem to report errors, though.

The OP's snippet has at least two definite UBs and one red-herring:

From N1570

[UB #1] Missing main in hosted environment:

J2. Undefined Behavior

[...] A program in a hosted environment does not define a function named main using one of the specified forms (5.1.2.2.1).

[UB #2] Even if we ignore the above there still remains the issue of taking the address of a void expression which is explicitly forbidden:

6.3.2.1 Lvalues, arrays, and function designators

1 An lvalue is an expression (with an object type other than void) that potentially designates an object; [footnote 64]

and:

6.5.3.2 Address and indirection operators

Constraints

1 The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.

[Note: emphasis on lvalue mine] Also, there is a section in the standard specifically on void:

6.3.2.2 void

1 The (nonexistent) value of a void expression (an expression that has type void) shall not be used in any way, and implicit or explicit conversions (except to void) shall not be applied to such an expression.

A file-scope definition is a primary expression (6.5), and so is taking the address of the object denoted by foo. Incidentally, the latter invokes UB, so this is explicitly ruled out. What remains to be figured out is whether removing the extern specifier makes the above valid or not.

In our case, for foo, as per §6.2.2/5:

5 [...] If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.

i.e. even if we left out the extern we'd still land in the same problem.

dirkgently answered Nov 01 '22 14:11


One limitation of C's linker-interaction semantics is that it provides no mechanism for numeric link-time constants. In some projects, it may be necessary for static initializers to include numeric values which are not available at compile time but will be available at link time. On some platforms, this may be accomplished by defining somewhere (e.g. in an assembly-language file) a label whose address, if cast to int, would yield the numeric value of interest. An extern declaration can then be used within the C file to make the "address" of that thing available as a compile-time constant.

This approach is very much platform-specific (as would be anything using assembly language), but it makes possible some constructs that would be problematic otherwise. A somewhat nasty aspect of it is that if the label is declared in C with a type like unsigned char[], that will convey the impression that the address may be dereferenced or have arithmetic performed upon it. If a compiler will accept extern void foo;, then (int)&foo will convert the linker-assigned address for foo to an integer using the same pointer-to-integer semantics as would be applicable with any other void*.

I don't think I've ever used void for that purpose (I've always used extern unsigned char[]) but would think void would be cleaner if something defined it as being a legitimate extension (nothing in the C standard requires that any ability exist anywhere to create a linker symbol which can be used as anything other than one specific non-void type; on platforms where no means would exist to create a linker identifier which a C program could define as extern void, there would be no need for compilers to allow such syntax).
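The static-initializer idea can be sketched with the array form as follows. Everything here is illustrative: link_const, link_const_addr, and link_const_value are hypothetical names, and the one-byte array at the end stands in for a label that would really come from an assembly file or linker script. The initializer holds a pointer, which is an address constant in standard C; whether the integer cast may appear directly inside a static initializer depends on the implementation, so this sketch performs the cast at run time instead.

```c
#include <stdint.h>

/* Hypothetical link-time symbol; normally defined outside of C. */
extern unsigned char link_const[];

/* Static initializer using an address constant that the linker resolves. */
static const void *const link_const_addr = link_const;

/* Pointer-to-integer conversion done at run time; the resulting value
   is implementation-defined. */
uintptr_t link_const_value(void) { return (uintptr_t)link_const_addr; }

/* Stand-in definition so this sketch links by itself. */
unsigned char link_const[1];
```

With a real linker-defined label, link_const_value() would return whatever number the label's address was set to; here it simply returns the address of the stand-in array.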

supercat answered Nov 01 '22 12:11