For the following C code:
struct _AStruct {
    int a;
    int b;
    float c;
    float d;
    int e;
};
typedef struct _AStruct AStruct;

AStruct test_callee5();
void test_caller5();

void test_caller5() {
    AStruct g = test_callee5();
    AStruct h = test_callee5();
}
I get the following disassembly for Win32:
_test_caller5:
00000000: lea eax,[esp-14h]
00000004: sub esp,14h
00000007: push eax
00000008: call _test_callee5
0000000D: lea ecx,[esp+4]
00000011: push ecx
00000012: call _test_callee5
00000017: add esp,1Ch
0000001A: ret
And for Linux32:
00000000 <test_caller5>:
0: push %ebp
1: mov %esp,%ebp
3: sub $0x38,%esp
6: lea 0xffffffec(%ebp),%eax
9: mov %eax,(%esp)
c: call d <test_caller5+0xd>
11: sub $0x4,%esp ;;;;;;;;;; Note this extra sub ;;;;;;;;;;;;
14: lea 0xffffffd8(%ebp),%eax
17: mov %eax,(%esp)
1a: call 1b <test_caller5+0x1b>
1f: sub $0x4,%esp ;;;;;;;;;; Note this extra sub ;;;;;;;;;;;;
22: leave
23: ret
I am trying to understand the difference in how the caller behaves after the call. Why does the caller in Linux32 do these extra subs?
I assumed that both targets follow the cdecl calling convention. Doesn't cdecl define how a function returning a structure is called?
EDIT:
I added an implementation of the callee. And sure enough, you can see that the Linux32 callee pops its argument, while the Win32 callee does not:
AStruct test_callee5()
{
    AStruct S = {0};
    return S;
}
Win32 disassembly:
test_callee5:
00000000: mov eax,dword ptr [esp+4]
00000004: xor ecx,ecx
00000006: mov dword ptr [eax],0
0000000C: mov dword ptr [eax+4],ecx
0000000F: mov dword ptr [eax+8],ecx
00000012: mov dword ptr [eax+0Ch],ecx
00000015: mov dword ptr [eax+10h],ecx
00000018: ret
Linux32 disassembly:
00000000 <test_callee5>:
0: push %ebp
1: mov %esp,%ebp
3: sub $0x20,%esp
6: mov 0x8(%ebp),%edx
9: movl $0x0,0xffffffec(%ebp)
10: movl $0x0,0xfffffff0(%ebp)
17: movl $0x0,0xfffffff4(%ebp)
1e: movl $0x0,0xfffffff8(%ebp)
25: movl $0x0,0xfffffffc(%ebp)
2c: mov 0xffffffec(%ebp),%eax
2f: mov %eax,(%edx)
31: mov 0xfffffff0(%ebp),%eax
34: mov %eax,0x4(%edx)
37: mov 0xfffffff4(%ebp),%eax
3a: mov %eax,0x8(%edx)
3d: mov 0xfffffff8(%ebp),%eax
40: mov %eax,0xc(%edx)
43: mov 0xfffffffc(%ebp),%eax
46: mov %eax,0x10(%edx)
49: mov %edx,%eax
4b: leave
4c: ret $0x4 ;;;;;;;;;;;;;; Note this ;;;;;;;;;;;;;;
There is no single "cdecl" calling convention. It is defined by the compiler and operating system.
Also, reading the assembly, I am not sure the convention is actually different: in both cases the caller provides a buffer for the output as an extra argument. It's just that gcc chose different instructions (the second extra sub is strange; is that code optimized?).
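To make the "extra argument" concrete, here is a minimal sketch of what both compilers effectively turn the code into, written back as plain C. The names test_callee5_lowered and test_caller5_lowered are hypothetical, made up for illustration; neither compiler emits such symbols. What C cannot express is who removes the hidden pointer from the stack afterwards: in the Linux32 listing the callee does it with ret $0x4 and the caller's sub $0x4,%esp re-reserves the slot, while in the Win32 listing the callee uses a plain ret and the caller cleans up once with add esp,1Ch.

```c
#include <assert.h>

struct _AStruct {
    int a;
    int b;
    float c;
    float d;
    int e;
};
typedef struct _AStruct AStruct;

/* Hypothetical hand-lowered form of test_callee5(): the struct is not
 * returned in registers. The caller passes the address of a buffer and
 * the callee fills it in. */
AStruct *test_callee5_lowered(AStruct *result)
{
    /* Equivalent of "AStruct S = {0}; return S;" */
    result->a = 0;
    result->b = 0;
    result->c = 0.0f;
    result->d = 0.0f;
    result->e = 0;

    /* Both listings leave the buffer address in eax on return. */
    return result;
}

/* Hypothetical hand-lowered form of test_caller5(). Returns 1 when both
 * buffers were filled with zeros, so the sketch can be checked. */
int test_caller5_lowered(void)
{
    AStruct g, h;

    /* The "lea ...; push/mov ...; call" sequences in the listings:
     * compute a buffer address, pass it, call. */
    AStruct *pg = test_callee5_lowered(&g);
    AStruct *ph = test_callee5_lowered(&h);

    /* The callee hands back the same address it was given. */
    assert(pg == &g && ph == &h);

    return g.a == 0 && g.e == 0 && h.a == 0 && h.e == 0;
}
```

This only models the data flow that the two listings share. The stack cleanup after the call is an ABI detail below the level of C, which is why the two outputs can differ for the same source.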