Why must we recompile C source code for a different OS on the same machine?

Question

When I compile my C source code (for example, in a Linux environment), the compiler generates a file in a "machine-readable" format.

Why doesn't the same file work on the same machine under a different operating system?
Is the problem in the way we "execute" this file?

mtijanic · Accepted Answer

Sometimes it will work, depending on the executable format, the libraries you use, and so on. The main obstacle is that operations such as allocating memory or creating a window all call OS functions. So you have to compile for the target OS, with its libraries linked in (statically or dynamically).

However, the instructions themselves are the same. So, if your program doesn't use any OS functions (no standard library or any other library), the code itself could run on another OS. The second problem is executable formats: a Windows .exe is very different from, for example, an ELF file. A flat format that contains just the instructions (such as .com), however, has no OS-specific structure and could be loaded on any system.
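The format difference is visible in the very first bytes of a file. As a rough sketch (the function name guess_exe_format is hypothetical), one can tell the two formats apart by their signatures: ELF files begin with the byte 0x7F followed by "ELF", and Windows executables begin with the DOS header signature "MZ":

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: guess the executable format from the leading bytes. */
const char *guess_exe_format(const unsigned char *buf, size_t len)
{
    /* ELF: 0x7F 'E' 'L' 'F'. The literal is split so that 'E' is not
       parsed as part of the hex escape. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return "ELF";

    /* Windows .exe: starts with the DOS header signature "MZ". */
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return "PE/MZ";

    /* A flat file of raw instructions has no signature at all. */
    return "unknown";
}
```

Each OS loader checks for the header it understands and rejects everything else, long before a single instruction of the program runs.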

EDIT: A fun experiment would be to compile some functions to a flat format (just the instructions) on one OS (e.g. Windows). For example:

int add(int x, int y) { return x + y; }

Save just the instructions to a file, without any relocation or other staging info. Then, on a different OS (e.g. Linux) compile a full program that will do something like this:

typedef int (*PFUNC)(int, int); // pointer to a function like our add one

void *buf = malloc(200); // make sure you have enough space.
FILE *f = fopen("add.com", "rb");
fread(buf, 1, 200, f); // Load the file contents (up to 200 bytes) into buf
fclose(f);
PFUNC p = (PFUNC)buf; // treat the loaded bytes as a function
int ten = p(4, 6);

For this to work, you'd also need to tell the OS/compiler that you want to be able to execute allocated memory. I'm not sure how to do that, but I know it can be done.

ach · Answer

I have been asked what an ABI discrepancy is. I think it's best to explain with a simple example.

Consider a little silly function:

int f(int a, int b, int (*g)(int, int))
{
    return g(a * 2, b * 3) * 4;
}

Compile it for x64/Windows and for x64/Linux.

For x64/Windows the compiler emits something like:

f:
sub         rsp,28h
lea         edx,[rdx+rdx*2]
add         ecx,ecx
call        r8
shl         eax,2
add         rsp,28h
ret

For x64/Linux, something like:

f:
sub    $0x8,%rsp
lea    (%rsi,%rsi,2),%esi
add    %edi,%edi
callq  *%rdx
add    $0x8,%rsp
shl    $0x2,%eax
retq

Allowing for different traditional notations of assembly language on Windows and Linux, there obviously are substantial differences in the code.

The Windows version clearly expects a to arrive in ECX (lower half of the RCX register), b in EDX (lower half of the RDX register), and g in the R8 register. This is mandated by the x64/Windows calling convention, which is a part of the ABI (application binary interface). The code also prepares arguments to g in ECX and EDX.

The Linux version expects a in EDI (the lower half of the RDI register), b in ESI (the lower half of the RSI register), and g in the RDX register. This is mandated by the calling convention of the System V AMD64 ABI (used on Linux and other Unix-like operating systems on x64). The code prepares arguments to g in EDI and ESI.

Now imagine that we run a Windows program which somehow extracts the body of f from a Linux-targeted module and calls it:

int g(int a, int b);

typedef int (*G)(int, int);
typedef int (*F)(int, int, G);

F f = (F) load_linux_module_and_get_symbol("module.so", "f");
int result = f(3, 4, &g);

What is going to happen? Since on Windows functions expect their arguments in ECX, EDX and R8, the compiler will place actual arguments in those registers:

mov         edx,4
lea         r8,[g]
lea         ecx,[rdx-1]
call        qword ptr [f1]

But the Linux-targeted version of f looks for values elsewhere. In particular, it is looking for the address of g in RDX. We have just loaded 4 into EDX (which on x64 also clears the upper half of RDX), so RDX holds the value 4 rather than a valid code address. The program will most likely crash.

Running Windows-targeted code on a Linux system will produce the same effect.

Thus, we cannot run 'foreign' code without a thunk. A thunk is a piece of low-level code which rearranges arguments to allow calls between pieces of code following different sets of rules. (Thunks may have to do more than that, because the effects of an ABI are not necessarily limited to the calling convention.) You typically cannot write a thunk in a high-level programming language.

Note that in our scenario we need to provide thunks for both f ('host-to-foreign') and g ('foreign-to-host').
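As a side note, some compilers can generate this argument shuffling for you through an extension. The sketch below assumes GCC or Clang targeting x86-64, where the ms_abi function attribute asks for the x64/Windows calling convention; the names WIN_ABI, f_win, g_win and call_f_win are hypothetical. On other compilers or architectures the macro expands to nothing and the code is an ordinary C program.

```c
/* Assumption: GCC/Clang on x86-64, which accept the ms_abi attribute. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define WIN_ABI __attribute__((ms_abi))
#else
#define WIN_ABI
#endif

/* Pointer to a function that follows the x64/Windows convention. */
typedef int (WIN_ABI *G)(int, int);

/* Stand-in for the callback g: expects a in ECX and b in EDX. */
static int WIN_ABI g_win(int a, int b)
{
    return a + b;
}

/* Stand-in for the 'foreign' f: expects a in ECX, b in EDX, g in R8. */
static int WIN_ABI f_win(int a, int b, G g)
{
    return g(a * 2, b * 3) * 4;
}

/* Host-side caller. Because the declarations carry the attribute, the
   compiler places the arguments where f_win expects them, which is the
   job a hand-written thunk would otherwise do. */
int call_f_win(int a, int b)
{
    return f_win(a, b, g_win);
}
```

This only covers the calling convention. It does not load a foreign executable format or supply the foreign OS functions, so it is not a way to run a complete Windows program on Linux.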

Tags:

c

compilation

alejho
