
Are functions and variables prefixed with an "_" when compiled using gcc?

I am learning OS development in a Linux environment using GCC. I learnt from Bran's Kernel Development that every C function and variable name is prefixed with an "_" (underscore) in the corresponding assembly source file once compiled. But when I went through the assembly output of a compiled C program, I couldn't find a "_main" function at all. I ran the following:

cpp sample.c sample.i

gcc -S sample.i

Panther Coder asked Jan 06 '23 18:01


2 Answers

That was true in the early days. A given C function foo would show up as _foo in the assembler. This was done to avoid conflicts with hand-written .s files.

Symbol names were also limited to 8 characters in total [a linker restriction].

This hasn't been true for decades. Now, symbols are no longer prefixed with _ and can be much longer than 8 characters.


UPDATE:

So, nowadays GCC does not produce a _ in front of functions and variables?

For the most part, no. IMO, the reference you're citing, on this point, does seem to be a bit dated.

Most POSIX systems (e.g. linux, *BSD) use gcc [or clang] and they leave off the _.

When I first started programming in C [circa 1981], the _ was still being used. This was on AT&T Unix v7, System III, and System V.

IIRC, it was gone by the early 1990s for newer systems (like linux). Personally, I haven't encountered the _ prefix since then, but I've [mostly] used linux [and sometimes cygwin].

Some AT&T Unix derived systems may have kept it around for backward compatibility, but, eventually, most everybody standardized on "foo is foo". I don't have access to OSX, so I can't rule out Johnathan's comment regarding that.

The _ had been around since the early days of Unix (circa 1970). This was before my time, but, IIRC, Unix was originally written in assembler. It was converted to C. The _ was to demarcate functions either written in C, or asm ones that could be called from C functions.

Those that didn't have the prefix were "asm only" [as they may have used non-standard calling conventions]. Back in the day, everything was precious: RAM, CPU cycles, etc.

So, asm functions could/would use "tricks" to conserve resources. Several asm functions could work as a group because they knew about one another.

If a given asm function could be called from C, the _ prefixed symbol was the C compatible "wrapper" for it [that did extra save/restore in the prolog/epilog].

So, I can just call the main function of a C program as "call main" instead of "call _main"?

That's a reasonably safe bet.

If you're calling a given function from C, it will automatically do the right thing (i.e. add prefix or not).

It's only when trying to call a C function from hand generated assembler that the issue might even come up.

So, for asm, I'd just do the simple thing and do call main. It will work on most [if not all] systems.

If you wanted to "bullet proof" your code, you could run your asm through the C preprocessor (via a .S file) and do (e.g.):

#ifdef C_USES_UNDERSCORE
#define CF(_x)          _##_x
#else
#define CF(_x)          _x
#endif

    call    CF(main)

But, I think that's overkill.

It also illustrates the whole problem with the _ prefix thing. On a modern system [with lots of memory and CPU cycles], why should an assembler function have to know whether an ABI compatible function it is calling was generated from C or hand written assembler?

Craig Estey answered Jan 23 '23 05:01


As detailed by Craig, it's a convention that modern formats/ABIs like COFF and ELF don't follow anymore.

On some targets, that use different ABIs, it's still in use. Examples are NeXT/OS X's Mach-O or 16- and 32-bit Windows. 64-bit Windows doesn't use the underscore anymore (although GCC continued doing so for a time, till 4.5.1 specifically).

Additionally, the underscore might appear as part of a bigger prefix. For example, __imp_ in __declspec(dllimport) symbols or _Z in names mangled according to the Itanium C++ ABI.

If, for some reason, you need to influence the mangling, GCC provides the -fleading-underscore and -fno-leading-underscore flags. Using them will break ABI compatibility.


a3f answered Jan 23 '23 05:01