I am trying to compile a source file with tcc (version 0.9.26) and link it against a gcc-generated .o file, but the result behaves strangely. The gcc (version 5.3.0) is from MinGW 64-bit.
More specifically, I have the following two files (te1.c and te2.c). I ran the following commands on a Windows 7 box:
c:\tcc> gcc -c te1.c
c:\tcc> objcopy -O elf64-x86-64 te1.o #this is needed because te1.o from the previous step is in COFF format, and tcc only understands ELF format
c:\tcc> tcc te2.c te1.o
c:\tcc> te2.exe
567in dummy!!!
Note that 4 bytes are cut off from the start of the string 1234567in dummy!!!\n. I wonder what could have gone wrong.
Thanks Jin
========file te1.c===========
#include <stdio.h>
void dummy () {
printf1("1234567in dummy!!!\n");
}
========file te2.c===========
#include <stdio.h>
void printf1(char *p) {
printf("%s\n",p);
}
extern void dummy();
int main(int argc, char *argv[]) {
dummy();
return 0;
}
Update 1
I saw a difference in the disassembly between te1.o (te1.c compiled by tcc) and te1_gcc.o (te1.c compiled by gcc). In the tcc-compiled code I saw lea -0x4(%rip),%rcx, while in the gcc-compiled code I saw lea 0x0(%rip),%rcx.
I am not sure why.
C:\temp>objdump -d te1.o
te1.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <dummy>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 81 ec 20 00 00 00 sub $0x20,%rsp
b: 48 8d 0d fc ff ff ff lea -0x4(%rip),%rcx # e <dummy+0xe>
12: e8 fc ff ff ff callq 13 <dummy+0x13>
17: c9 leaveq
18: c3 retq
19: 00 00 add %al,(%rax)
1b: 00 01 add %al,(%rcx)
1d: 04 02 add $0x2,%al
1f: 05 04 03 01 50 add $0x50010304,%eax
C:\temp>objdump -d te1_gcc.o
te1_gcc.o: file format pe-x86-64
Disassembly of section .text:
0000000000000000 <dummy>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 20 sub $0x20,%rsp
8: 48 8d 0d 00 00 00 00 lea 0x0(%rip),%rcx # f <dummy+0xf>
f: e8 00 00 00 00 callq 14 <dummy+0x14>
14: 90 nop
15: 48 83 c4 20 add $0x20,%rsp
19: 5d pop %rbp
1a: c3 retq
1b: 90 nop
1c: 90 nop
1d: 90 nop
1e: 90 nop
1f: 90 nop
Update 2
Using a binary editor, I changed the machine code in te1.o (produced by gcc), replacing lea 0(%rip),%rcx with lea -0x4(%rip),%rcx. When I then linked it with tcc, the resulting exe worked fine.
More precisely, I did the following:
c:\tcc> gcc -c te1.c
c:\tcc> objcopy -O elf64-x86-64 te1.o
c:\tcc> use a binary editor to change the bytes from (48 8d 0d 00 00 00 00) to (48 8d 0d fc ff ff ff)
c:\tcc> tcc te2.c te1.o
c:\tcc> te2
1234567in dummy!!!
Update 3
As requested, here is the output of objdump -r te1.o
C:\temp>gcc -c te1.c
C:\temp>objdump -r te1.o
te1.o: file format pe-x86-64
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
000000000000000b R_X86_64_PC32 .rdata
0000000000000010 R_X86_64_PC32 printf1
RELOCATION RECORDS FOR [.pdata]:
OFFSET TYPE VALUE
0000000000000000 rva32 .text
0000000000000004 rva32 .text
0000000000000008 rva32 .xdata
C:\temp>objdump -d te1.o
te1.o: file format pe-x86-64
Disassembly of section .text:
0000000000000000 <dummy>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 20 sub $0x20,%rsp
8: 48 8d 0d 00 00 00 00 lea 0x0(%rip),%rcx # f <dummy+0xf>
f: e8 00 00 00 00 callq 14 <dummy+0x14>
14: 90 nop
15: 48 83 c4 20 add $0x20,%rsp
19: 5d pop %rbp
1a: c3 retq
1b: 90 nop
1c: 90 nop
1d: 90 nop
1e: 90 nop
1f: 90 nop
This has nothing to do with tcc or calling conventions. It has to do with the different linker conventions for the elf64-x86-64 and pe-x86-64 formats.
With PE, the linker implicitly subtracts 4 when it calculates the final offset.
With ELF, it does not do this. Because of this, 0 is the correct initial value for PE, and -4 is correct for ELF.
Unfortunately, objcopy does not convert this, which is a bug in objcopy.
Add
extern void printf1(char *p);
to your te1.c file.
Otherwise, with no prototype in scope, the compiler may assume the argument is a 32-bit integer, whereas pointers are 64 bits wide.
Edit: this is still not working. I found out that the function never returns (calling printf1 a second time does nothing!). It seems that the first 4 bytes are consumed as a return address or something like that. In gcc 32-bit mode it works fine.
It sounds like a calling convention problem to me, but I still cannot figure it out.
Another clue: calling printf from the te1.c side (gcc, using the tcc stdlib bindings) crashes with a segfault.
I disassembled the executable. The first part is the repeated call from the tcc side:
40104f: 48 8d 05 b3 0f 00 00 lea 0xfb3(%rip),%rax # 0x402009
401056: 48 89 45 f8 mov %rax,-0x8(%rbp)
40105a: 48 8b 4d f8 mov -0x8(%rbp),%rcx
40105e: e8 9d ff ff ff callq 0x401000
401063: 48 8b 4d f8 mov -0x8(%rbp),%rcx
401067: e8 94 ff ff ff callq 0x401000
40106c: 48 8b 4d f8 mov -0x8(%rbp),%rcx
401070: e8 8b ff ff ff callq 0x401000
401075: 48 8b 4d f8 mov -0x8(%rbp),%rcx
401079: e8 82 ff ff ff callq 0x401000
40107e: e8 0d 00 00 00 callq 0x401090
401083: b8 00 00 00 00 mov $0x0,%eax
401088: e9 00 00 00 00 jmpq 0x40108d
40108d: c9 leaveq
40108e: c3 retq
The second part is the repeated call (6 times) to the same function. As you can see, the target address is different (shifted by 4 bytes, like your data)! It kind of works just once because the first 4 bytes of the function are the following instructions:
401000: 55 push %rbp
401001: 48 89 e5 mov %rsp,%rbp
so the stack is destroyed if those are skipped!
40109f: 48 89 45 f8 mov %rax,-0x8(%rbp)
4010a3: 48 8b 45 f8 mov -0x8(%rbp),%rax
4010a7: 48 89 c1 mov %rax,%rcx
4010aa: e8 55 ff ff ff callq 0x401004
4010af: 48 8b 45 f8 mov -0x8(%rbp),%rax
4010b3: 48 89 c1 mov %rax,%rcx
4010b6: e8 49 ff ff ff callq 0x401004
4010bb: 48 8b 45 f8 mov -0x8(%rbp),%rax
4010bf: 48 89 c1 mov %rax,%rcx
4010c2: e8 3d ff ff ff callq 0x401004
4010c7: 48 8b 45 f8 mov -0x8(%rbp),%rax
4010cb: 48 89 c1 mov %rax,%rcx
4010ce: e8 31 ff ff ff callq 0x401004
4010d3: 48 8b 45 f8 mov -0x8(%rbp),%rax
4010d7: 48 89 c1 mov %rax,%rcx
4010da: e8 25 ff ff ff callq 0x401004
4010df: 48 8b 45 f8 mov -0x8(%rbp),%rax
4010e3: 48 89 c1 mov %rax,%rcx
4010e6: e8 19 ff ff ff callq 0x401004
4010eb: 90 nop