 

Function definitions of built-in functions in C

We include header files like stdio.h in our C programs to use the built-in library functions. I used to think that these header files contained the function definitions of the built-in functions that we use in our programs, but I soon found that this was not so.

When we open one of these header files (e.g. stdio.h), all it contains is function prototypes; I could see no function definitions there. I see things like this:

int     _EXFUN(printf, (const char *, ...));
int     _EXFUN(scanf, (const char *, ...));
int     _EXFUN(sscanf, (const char *, const char *, ...));
int     _EXFUN(vfprintf, (FILE *, const char *, __VALIST));
int     _EXFUN(vprintf, (const char *, __VALIST));
int     _EXFUN(vsprintf, (char *, const char *, __VALIST));
int     _EXFUN(vsnprintf, (char *, size_t, const char *, __VALIST));
int     _EXFUN(fgetc, (FILE *));
char *  _EXFUN(fgets, (char *, int, FILE *));
int     _EXFUN(fputc, (int, FILE *));
int     _EXFUN(fputs, (const char *, FILE *));
int     _EXFUN(getc, (FILE *));
int     _EXFUN(getchar, (void));
char *  _EXFUN(gets, (char *));
int     _EXFUN(putc, (int, FILE *));
int     _EXFUN(putchar, (int));
int     _EXFUN(puts, (const char *));

(source: https://www.gnu.org/software/m68hc11/examples/stdio_8h-source.html)

Then I was told that the function definitions might be in one of the header files included by the header file we examine, and I believed that for some time. Since then I've looked into a lot of header files but never found a single function definition.

I recently read that the function definitions of the built-in functions are not provided directly but are supplied in some special way. Is this true? If so, where are the function definitions of the built-in functions stored? And how are they brought into our programs, since the header files only have their prototypes?

EDIT: Please note that I showed the contents of the header file just as a sample. My question is not about the _EXFUN macro.

J...S asked Mar 14 '17 13:03


1 Answer

A 'prototype' is the function's declaration, and declarations are what you will find in the header files. In this case the prototypes are built with the _EXFUN() macro, and preprocessing reveals them in full. The following command passes stdio.h through the preprocessor and writes the result to stdout:

gcc -E -x c /dev/null -include stdio.h

If you wade through the output, you'll find the expected prototypes. On my system, two examples look like this:

extern int printf (const char *__restrict __format, ...);

extern int vfprintf (FILE *__restrict __s, const char *__restrict __format,
       __gnuc_va_list __arg);

I recently read that the function definitions of the built-in functions are not provided directly but are supplied in some special way. Is this true?

Yes, via libraries. If you're looking for a function's implementation, you need to look at the source of the library that provides it. In this case, stdio.h belongs to an implementation of the C standard library (libc), which in my case is glibc.

Header files should almost never include implementation details. They should contain only what needs to be shared: struct, enum and typedef definitions, and function prototypes.

If you're looking for the implementation / source of printf() (as an example), then you will need to look at the library's source code.

It is quite unlikely that your toolchain ships with the library source code. It will probably include the libraries (*.a and *.so) and the header files (*.h). Some package managers split a library into two packages, for example mylibrary and mylibrary-dev. In that case the former generally contains the library binaries, while the latter contains the header files so that you can use the library in your application. Neither package usually contains the sources.

In my case (as mentioned above), the library is glibc:

  • https://sourceware.org/git/?p=glibc.git;a=summary

As you're interested in printf(), you'll need to look at stdio-common/printf.c:

  • https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/printf.c;h=ba84064e9e67e87311fc4d7fb0ebc28b7e6a83f6;hb=HEAD#l24

This is of course just a thin wrapper around vfprintf(). It's at this point that you start to realise that some libraries are very large and complex... You can spend quite a bit of time trying to see 'through' macros to find your target function, which happens to be in stdio-common/vfprintf.c:

  • https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/vfprintf.c;h=54b1ba2ac87ece033b0dcf95cbe43bf401b1aa33;hb=HEAD#l1233

And how are they brought into our programs since the header files only have their prototypes?

One of the final steps of 'compiling' an application is 'linking'. There are two types:

Static Linking

The machine code is taken from *.a files - static libraries. These files are just archives (see ar(1)) containing object files (*.o), which in turn contain the machine code.

  • Compile Time: The actual machine code for the specific function is copied into your binary.

  • Run-Time: When your binary is loaded, it already has a copy of the printf() function. Job done.

Dynamic Linking

The machine code is taken from *.so files - shared libraries (the Windows equivalent is the 'DLL' - dynamic-link library). These files are binaries in their own right, containing a set of symbols, or entry points, that may be used.

  • Compile Time: The linker will just ensure that the functions that you're calling exist in the shared libraries, and note down that they need to be linked at run-time.

  • Run-Time: When your binary is loaded, it has a list of 'symbols' that need to be linked, and where they can be found. At this point, the dynamic linker (/lib/ld-linux.so.2 for me) is invoked. In simple terms, the dynamic linker will then 'wire up' all of the shared library functions before your application executes. In reality, this can be deferred until the symbol is actually accessed.


As yet another extension... You have to be careful - compilers will often optimise expensive operations away.

The following simple use of printf() will likely get optimised to a call to puts():

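#include <stdio.h>

/* GCC accepts void main, but it is non-standard;
   hosted programs should normally use int main(void). */
void main(void) {
    printf("Hello World\n");
}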

Output of objdump -d ${MY_BINARY}:

[...]

000000000040052d <main>:
  40052d:       55                      push   %rbp
  40052e:       48 89 e5                mov    %rsp,%rbp
  400531:       bf c4 05 40 00          mov    $0x4005c4,%edi
  400536:       e8 d5 fe ff ff          callq  400410 <puts@plt>
  40053b:       5d                      pop    %rbp
  40053c:       c3                      retq
  40053d:       0f 1f 00                nopl   (%rax)

[...]

For further reading, see here: https://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html

Attie answered Nov 15 '22 04:11