
Why does declaring main as an array compile?

Tags: c, main, gcc, clang

I saw a snippet of code on CodeGolf that's intended as a compiler bomb, where main is declared as a huge array. I tried the following (non-bomb) version:

int main[1] = { 0 }; 

It seems to compile fine under Clang and with only a warning under GCC:

warning: 'main' is usually a function [-Wmain]

The resulting binary is, of course, garbage.

But why does it compile at all? Is it even allowed by the C specification? The section that I think is relevant says:

5.1.2.2.1 Program startup

The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters [...] or with two parameters [...] or in some other implementation-defined manner.

Does "some other implementation-defined manner" include a global array? (It seems to me that the spec still refers to a function.)

If not, is it a compiler extension? Or is it a feature of the toolchains that serves some other purpose and that they simply make available through the frontend?

Theodoros Chatzigiannakis asked Jan 13 '16 10:01



1 Answer

This compiles because C also allows a "non-hosted", or freestanding, environment, which does not require a main function. In such an environment the name main is free for other uses, so the language itself permits declarations like this one. Most compilers are designed to support both kinds of environment (the difference is mostly in how linking is done), so they do not reject constructs that would be invalid only in a hosted environment.

The section you refer to describes the hosted environment. The corresponding text for a freestanding environment (5.1.2.1) is:

in a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined.
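A translation unit can check which of the two environments it is being compiled for through the standard predefined macro __STDC_HOSTED__ (C99 and later), which is 1 for a hosted implementation and 0 for a freestanding one. The following is a minimal sketch; the helper names compiled_as_hosted and startup_rule are made up for illustration and are not part of any library.

```c
/* __STDC_HOSTED__ is a standard predefined macro (C99 and later):
   1 for a hosted implementation, 0 for a freestanding one. */
#ifndef __STDC_HOSTED__
#error "this sketch assumes a C99-or-later compiler"
#endif

/* Hypothetical helper: reports whether this translation unit was
   compiled for a hosted environment. */
int compiled_as_hosted(void)
{
    return __STDC_HOSTED__ == 1;
}

/* Hypothetical helper: summarizes which startup rule applies. */
const char *startup_rule(void)
{
#if __STDC_HOSTED__
    return "hosted: startup calls a function named main (5.1.2.2.1)";
#else
    return "freestanding: startup function name and type are "
           "implementation-defined (5.1.2.1)";
#endif
}
```

With GCC or Clang, a default build should take the hosted branch, while passing -ffreestanding should switch the macro to 0 and select the other branch.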

If you then link the program as usual, it goes wrong, because the linker normally knows little about the nature of a symbol (its type, or even whether it is a function or a variable). In this case the linker happily resolves the startup code's reference to main to the variable named main. Only if no symbol with that name were found would you get a link error.
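Name-based resolution can be pictured with a toy model. The sketch below is not how any real linker is implemented, and real object formats carry more metadata than this; the struct, the function toy_resolve, and the addresses are all invented for illustration. It only shows why a lookup keyed on the name alone cannot notice that main is an array.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a symbol-table entry: a name and an address.
   Nothing here says whether the symbol is a function or an object. */
struct toy_symbol {
    const char   *name;
    unsigned long address;   /* made-up addresses, for illustration only */
};

/* Resolve a reference purely by name. Returns the matching entry, or
   NULL when the name is undefined (the analogue of a link error). */
const struct toy_symbol *toy_resolve(const struct toy_symbol *table,
                                     size_t count, const char *name)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

/* Hypothetical table for a program that defined `int main[1]`:
   the startup code's reference to "main" matches the array's entry. */
static const struct toy_symbol toy_table[] = {
    { "_start", 0x401000UL },
    { "main",   0x404028UL },   /* an array living in the data segment */
};
```

In this model, looking up "main" succeeds and yields the array's address, which startup code would then jump to as if it were code; looking up a name that is absent yields NULL, which corresponds to the link error mentioned above.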

If you link it as usual, you are in effect using the compiler in hosted mode, and not defining main as required is then undefined behavior as per Annex J.2:

the behavior is undefined in the following circumstances:

  • ...
  • program in a hosted environment does not define a function named main using one of the specified forms (5.1.2.2.1)

The purpose of the freestanding option is to allow C to be used in environments where, for example, the standard library or the CRT initialization is not available. The code that runs before main is called (the CRT initialization, which sets up the C runtime) might not be provided, in which case you are expected to supply it yourself, and you may or may not choose to have a main.

skyking answered Oct 15 '22 14:10