
Why am I able to link without including ctype.h

Tags: c, gcc, gcc4.9

  • Without #include<ctype.h>, the following program outputs 1 and 0.
  • With the include, it outputs 1 and 1.

I am using TDM-GCC 4.9.2 64-bit. I wonder what the implementation of isdigit is in the first case, and why it is able to link.

#include<stdio.h>
//#include<ctype.h>
int main()
{
    printf("%d %d\n",isdigit(48),isdigit(48.4));
    return 0;
}
user1537366 asked Nov 17 '15 12:11




2 Answers

GCC 4.9 defaults to the C90 standard with GNU extensions (gnu90), which allows implicit function declarations. Without the header, the first call to isdigit implicitly declares it as int isdigit(), a function returning int with no information about its parameters. Because there is no prototype, the compiler cannot convert the arguments to the parameter type; it only applies the default argument promotions, so 48 is passed as an int and 48.4 is passed as a double. The library function expects an int, so the second call passes an argument of the wrong type and you have undefined behavior.

When you include the <ctype.h> header file, a correct prototype is in scope, so the compiler knows that isdigit takes an int argument and converts the double literal 48.4 to the integer 48 for the call.
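As a minimal sketch of that conversion (the helper names digit_flag_converted and digit_flag_cast are made up for illustration, and the code assumes an ASCII-compatible character set where 48 is '0'), the two functions below should behave identically: with the prototype visible, passing a double is equivalent to casting it to int first. The name is parenthesized so that the real function and its prototype are used even if the header also defines a macro.

```c
#include <ctype.h>

/* Hypothetical helper: passes a double straight to isdigit.
   The prototype from <ctype.h> makes the compiler convert it to int,
   so 48.4 becomes 48 before the call. */
int digit_flag_converted(double d)
{
    return (isdigit)(d) != 0;   /* isdigit returns nonzero, not necessarily 1 */
}

/* Hypothetical helper: the same call with the conversion written out. */
int digit_flag_cast(double d)
{
    return (isdigit)((int)d) != 0;
}
```

Calling either helper with 48.4 checks the character code 48; without the header there is no prototype and no such conversion takes place.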


As for why it links: these functions may additionally be implemented as macros, but that is not a requirement, and the standard still requires them to exist as real functions. They also have to be aware of the current locale, at least in the C11 standard (I don't have any older version available at the moment), which makes them harder to implement as macros and easier to implement as normal library functions. And since the standard library is always linked (unless you tell GCC otherwise), the functions are available.
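One way to check that isdigit exists as a real library function, whether or not the header also provides a macro of the same name, is to take its address. This is only a sketch; digit_via_pointer is a hypothetical helper name, not part of any library.

```c
#include <ctype.h>

/* Hypothetical helper: calls isdigit through a function pointer.
   A function-like macro is not expanded when the name is not followed
   by '(', so this refers to the actual library function. */
int digit_via_pointer(int c)
{
    int (*fn)(int) = isdigit;
    return fn(c) != 0;
}
```

If the library offered only a macro, the pointer initialization would fail to compile or link.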

Some programmer dude answered Sep 30 '22 10:09


First of all, #include directives have nothing to do with linking. Remember that anything starting with # in C is meant for the preprocessor, not the compiler or the linker.

That said, the function still has to be linked in from somewhere, doesn't it?

Let's go through the build one step at a time.
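To illustrate that the header only supplies a declaration, here is a minimal sketch that declares isdigit by hand instead of including <ctype.h>. The helper name digit_flag_no_header is hypothetical; the C standard allows declaring a library function yourself when its types need no header.

```c
/* No #include <ctype.h>: the declaration is written by hand.
   The definition still comes from the C library at link time. */
int isdigit(int c);

/* Hypothetical helper: normalizes isdigit's nonzero/zero result to 1/0. */
int digit_flag_no_header(int c)
{
    return isdigit(c) != 0;
}
```

This compiles without an implicit-declaration warning and links exactly as before, because the linker resolves the symbol from libc either way.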

$ gcc -c -Werror --std=c99 st.c 
st.c: In function ‘main’:
st.c:5:22: error: implicit declaration of function ‘isdigit’ [-Werror=implicit-function-declaration]
     printf("%d %d\n",isdigit(48),isdigit(48.4));
                      ^
cc1: all warnings being treated as errors

As you can see, gcc's built-in diagnostics (its lint-like checks) catch the missing declaration.

Never mind, let's proceed and ignore it...

$ gcc -c  --std=c99 st.c 
st.c: In function ‘main’:
st.c:5:22: warning: implicit declaration of function ‘isdigit’ [-Wimplicit-function-declaration]
     printf("%d %d\n",isdigit(48),isdigit(48.4));

This time it is only a warning, and we now have an object file in the current directory. Let's inspect it...

$ nm st.o 
                 U isdigit
0000000000000000 T main
                 U printf

As you can see, both printf and isdigit are listed as undefined (U). So the code has to come from somewhere, doesn't it?

Let's proceed to link it...

$ gcc st.o
$ nm a.out | grep  'printf\|isdigit'
                 U isdigit@@GLIBC_2.2.5
                 U printf@@GLIBC_2.2.5

As you can see, the situation has improved slightly: isdigit and printf are no longer the helpless loners they were in st.o. Both symbols are now tied to the symbol version GLIBC_2.2.5. But where is that GLIBC?

Well, let's examine the final executable a bit more...

$ ldd a.out 
        linux-vdso.so.1 =>  (0x00007ffe58d70000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb66f299000)
        /lib64/ld-linux-x86-64.so.2 (0x000055b26631d000)

Aha, there is libc. It turns out that, although you gave no such instruction, the executable is linked against three shared objects by default, and one of them is libc, which contains both printf and isdigit.

You can see the default behaviour of the linker with:

$ gcc -dumpspecs
*link:
%{!r:--build-id} %{!static:--eh-frame-hdr} %{!mandroid|tno-android-ld:%{m16|m32|mx32:;:-m elf_x86_64}                    %{m16|m32:-m elf_i386}                    %{mx32:-m elf32_x86_64}   --hash-style=gnu   --as-needed   %{shared:-shared}   %{!shared:     %{!static:       %{rdynamic:-export-dynamic}       %{m16|m32:-dynamic-linker %{muclibc:/lib/ld-uClibc.so.0;:%{mbionic:/system/bin/linker;:/lib/ld-linux.so.2}}}       %{m16|m32|mx32:;:-dynamic-linker %{muclibc:/lib/ld64-uClibc.so.0;:%{mbionic:/system/bin/linker64;:/lib64/ld-linux-x86-64.so.2}}}       %{mx32:-dynamic-linker %{muclibc:/lib/ldx32-uClibc.so.0;:%{mbionic:/system/bin/linkerx32;:/libx32/ld-linux-x32.so.2}}}}     %{static:-static}};:%{m16|m32|mx32:;:-m elf_x86_64}                    %{m16|m32:-m elf_i386}                    %{mx32:-m elf32_x86_64}   --hash-style=gnu   --as-needed   %{shared:-shared}   %{!shared:     %{!static:       %{rdynamic:-export-dynamic}       %{m16|m32:-dynamic-linker %{muclibc:/lib/ld-uClibc.so.0;:%{mbionic:/system/bin/linker;:/lib/ld-linux.so.2}}}       %{m16|m32|mx32:;:-dynamic-linker %{muclibc:/lib/ld64-uClibc.so.0;:%{mbionic:/system/bin/linker64;:/lib64/ld-linux-x86-64.so.2}}}       %{mx32:-dynamic-linker %{muclibc:/lib/ldx32-uClibc.so.0;:%{mbionic:/system/bin/linkerx32;:/libx32/ld-linux-x32.so.2}}}}     %{static:-static}} %{shared: -Bsymbolic}}

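What are the other two entries? linux-vdso.so.1 is a virtual shared object that the kernel maps into every process, and /lib64/ld-linux-x86-64.so.2 is the dynamic loader itself.

Remember that when you dug into a.out, both printf and isdigit were still shown as U, which means undefined. In other words, no memory address was associated with these symbols yet.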

This is where the magic lies. The library code is not copied into the executable at link time, as it is on older, statically linked systems; the dynamic loader maps the libraries into the process at run time.

How is it implemented? The jargon for it is lazy binding. Calls to a shared-library function go through a small stub in the procedure linkage table (PLT). The first time the process calls the function, its address is not yet known, so the stub transfers control to the dynamic loader, which looks up the symbol in the loaded libraries, records its address in the global offset table (GOT), and jumps to it. Later calls go straight to the function through the recorded address. All of this happens in user space, without involving the kernel.

There are several theoretical reasons why doing things lazily is better than doing them up front. That is a whole topic of its own, to be discussed some other time.

Aftnix answered Sep 30 '22 10:09