
How does the CRT call main() when it can be declared with different parameters?

Tags:

c++

c

We can write the main function in several ways:

  1. int main()
  2. int main(int argc,char *argv[])
  3. int main(int argc,char *argv[],char *environment[])

How does the run-time (CRT) startup code know which main should be called? Please note that I am not asking about the Unicode variants.

Pranit Kothari asked Apr 29 '12 12:04



3 Answers

The accepted answer is incorrect: there's no special code in the CRT to recognize the kind of main() declaration.

It works because of the cdecl calling convention, which specifies that arguments are pushed on the stack from right to left and that the caller cleans up the stack after the call. So the CRT simply passes all arguments to main() and pops them again when main() returns. The only thing you need to do is specify the arguments in the right order in your main() function declaration. The argc parameter has to be first, since it is the one on top of the stack; argv has to be second, and so on. Omitting an argument makes no difference, as long as you omit all the ones that follow as well.

This is also why the printf() function can work even though it takes a variable number of arguments: one argument, the first, is in a known position.
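
As an illustration of the variadic mechanism mentioned above, here is a minimal C++ sketch. The name sum_ints is hypothetical and not part of any library; its first parameter is the one argument in a known position and tells the function how many ints to read. Extra trailing arguments are never looked at, which is loosely analogous to a main() that declares fewer parameters than the startup code supplies.

```cpp
#include <cstdarg>

// Hypothetical helper for illustration only.
// `count` is the argument in a known position; it says how many
// int arguments the function should read from the variable part.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // read the next int argument
    }
    va_end(args);
    return total;
}
```

For example, sum_ints(3, 1, 2, 3) yields 6, and sum_ints(2, 10, 20, 99) yields 30 because the third value is never read.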

Hans Passant answered Nov 19 '22 06:11



In general, the compiler/linker would need to recognise the particular form of main that you are using and then include code to adapt that from the system startup function to your C or C++ main function.

It is true that specific compilers on specific platforms could get away without doing this, using the methods that Hans describes in his answer. However, not all platforms use the stack to pass parameters, and it is possible to write conforming C and C++ implementations in which the different forms of main have incompatible parameter lists. In such cases, the compiler/linker would need to determine which form of main to call.

David Heffernan answered Nov 19 '22 06:11



Hmmm. It seems that perhaps the currently accepted answer, which indicates that the previously accepted answer is incorrect, is itself incorrect. The tags on this question indicate it applies to C++ as well as C, so I’ll stick to the C++ spec, not C99. Regardless of all other explanations or arguments, the primary answer to this question is that “main() is treated special in an implementation-defined way.” I believe that David's answer is technically more correct than Hans', but I'll explain it in more detail....

The main() function is a funny one, treated by the compiler & linker with behavior that matches no other function. Hans is correct that there is no special code in the CRT to recognize different signatures of main(), but his assertion that it “works because of the cdecl calling convention” applies only to specific platform(s), notably Visual Studio. The real reason that there’s no special code in the CRT to recognize different signatures of main() is that there’s no need to. And though it’s sort of splitting hairs, tying the startup code into main() is the linker’s job at link time, not the CRT’s job at startup time.

Much of how the main() function is treated is implementation-defined, as per the C++ spec (see Section 3.6, “Start and termination”). It’s likely that most implementations’ compilers treat main() implicitly with something akin to extern "C" linkage, leaving main() in a non-decorated state so that regardless of its function prototype, its linker symbol is the same. Alternatively, the linker for an implementation could be smart enough to scan through the symbol table looking for any symbol whose decorated name resolves to some form of “[int|void] main(...)”. (Note that void as a return type is itself an implementation-specific extension, as the spec says that the return type of main() must be int.) Once such a function is found among the available symbols, the linker could simply use it where the startup code refers to “main()”, so the exact symbol name doesn’t necessarily have to match anything in particular; it could even be wmain() or something else, as long as either the linker knows what variations to look for, or the compiler gives all of the variations the same symbol name.

Also key to note is that the spec says that main() may not be overloaded, so the linker shouldn’t have to “pick” between multiple user implementations of various forms of main(). If it finds more than one, that’s a duplicate symbol error (or other similar error) even if the argument lists don’t match. And though all implementations “shall” allow both

int main() { /* ... */ }

and

int main(int argc, char* argv[]) { /* ... */ }

they are also permitted to allow other argument lists, including the version you show that includes an environment string array pointer, and any other variation that makes sense in any given implementation.
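
Where the three-parameter form is supported (it is an extension, not something the spec requires), the environment array is conventionally a null-terminated array of strings, much like argv. A minimal sketch of walking such an array follows; count_entries is a hypothetical helper name used only for illustration.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only.
// Counts the strings in a null-terminated array of C strings, the
// layout conventionally used for argv and for the environment array.
std::size_t count_entries(char* const* list) {
    if (list == nullptr) {
        return 0;  // nothing was passed at all
    }
    std::size_t n = 0;
    while (list[n] != nullptr) {
        ++n;  // advance until the terminating null pointer
    }
    return n;
}
```

Inside a three-parameter main(), a call such as count_entries(environment) would then report how many environment strings the startup code handed over, assuming the implementation follows this convention.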

As Hans indicates, the Visual Studio compiler’s cdecl calling convention (like the calling conventions of many other compilers) provides a framework wherein a caller can set up the calling environment (i.e. the stack, or ABI-defined registers, or some combination of the two) in such a way that a variable number of arguments can be passed, and when the callee returns, the caller is responsible for cleanup (popping the used argument space off the stack; in the case of registers, nothing needs to be done for cleanup). This setup lends itself neatly to the startup code passing more parameters than might be needed, and the user’s main() implementation is free to use or not use any of these arguments, as is the case with many platforms’ treatment of the various forms of main() you list in your question. However, this is not the only way a compiler+linker could accomplish this goal: instead, the linker could choose between various versions of the startup code based on the definition of your main(). Doing so would allow a wide variety of main() argument lists that would otherwise be impossible with the cdecl caller-cleanup model. And since all of that is implementation-defined, it’s legal per the C++ spec, as long as the compiler+linker supports at least the two combinations shown above (int main() and int main(int, char**)).
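
The idea of choosing a different startup adapter per form of main() can be sketched in ordinary C++ with overloads. This is only an analogy under stated assumptions: call_entry and the user_main_* functions are hypothetical names, and a real toolchain would make the choice inside the compiler or linker rather than through overload resolution.

```cpp
// Hypothetical entry-point types, one per supported form of main().
using EntryNoArgs  = int (*)();
using EntryArgs    = int (*)(int, char**);
using EntryArgsEnv = int (*)(int, char**, char**);

// One adapter per form. The startup side always has argc, argv and envp
// available; each adapter forwards only what its entry point declares.
inline int call_entry(EntryNoArgs entry, int, char**, char**) {
    return entry();
}
inline int call_entry(EntryArgs entry, int argc, char** argv, char**) {
    return entry(argc, argv);
}
inline int call_entry(EntryArgsEnv entry, int argc, char** argv, char** envp) {
    return entry(argc, argv, envp);
}

// Stand-ins for three user-written forms of main().
int user_main_plain() { return 7; }
int user_main_args(int argc, char**) { return argc; }
int user_main_env(int argc, char**, char** envp) {
    return envp != nullptr ? argc + 100 : argc;
}
```

Here the choice is made at compile time from the type of the function passed in, so no calling-convention trick is needed and each entry point receives exactly the parameters it declares.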

phonetagger answered Nov 19 '22 07:11
