I have two functions: one just casts from double to int64_t, the other calls std::round:
#include <cmath>
#include <cstdint>

std::int64_t my_cast(double d)
{
    auto t = static_cast<std::int64_t>(d);
    return t;
}

std::int64_t my_round(double d)
{
    auto t = std::round(d);
    return t;
}
They work correctly: my_cast(3.64) = 3 and my_round(3.64) = 4. But when I look at the assembly, they seem to be doing the same thing, so I am wondering how they get different results.
$ g++ -std=c++1y -c -O3 ./round.cpp -o ./round.o
$ objdump -dS ./round.o
./round.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z7my_castd>:
0: f2 48 0f 2c c0 cvttsd2si %xmm0,%rax
5: c3 retq
6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
d: 00 00 00
0000000000000010 <_Z8my_roundd>:
10: 48 83 ec 08 sub $0x8,%rsp
14: e8 00 00 00 00 callq 19 <_Z7my_castd+0x19> <========!!!
19: 48 83 c4 08 add $0x8,%rsp
1d: f2 48 0f 2c c0 cvttsd2si %xmm0,%rax
22: c3 retq
Disassembly of section .text.startup:
0000000000000030 <_GLOBAL__sub_I__Z7my_castd>:
30: 48 83 ec 08 sub $0x8,%rsp
34: bf 00 00 00 00 mov $0x0,%edi
39: e8 00 00 00 00 callq 3e <_GLOBAL__sub_I__Z7my_castd+0xe>
3e: ba 00 00 00 00 mov $0x0,%edx
43: be 00 00 00 00 mov $0x0,%esi
48: bf 00 00 00 00 mov $0x0,%edi
4d: 48 83 c4 08 add $0x8,%rsp
51: e9 00 00 00 00 jmpq 56 <_Z8my_roundd+0x46>
I am not sure what the purpose of that callq at offset 14 is, but even with it, my_cast and my_round both seem to be doing just a cvttsd2si, which I believe is conversion with truncation.
However, as I mentioned earlier, the two functions produce different (correct) values on the same input (say 3.64).
What is happening?
The assembly output is more useful (g++ ... -S && cat round.s):
...
_Z7my_castd:
.LFB225:
.cfi_startproc
cvttsd2siq %xmm0, %rax
ret
.cfi_endproc
...
_Z8my_roundd:
.LFB226:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
call round <<< This is what callq 19 means
addq $8, %rsp
.cfi_def_cfa_offset 8
cvttsd2siq %xmm0, %rax
ret
.cfi_endproc
As you can see, my_round calls std::round and then executes the cvttsd2siq instruction. This is because std::round(double) returns double, so its result still has to be converted to int64_t. And that is what cvttsd2siq is doing in both of your functions.
With g++ you can get a higher-level view of what's happening using the -fdump-tree-optimized switch:
$ g++ -std=c++1y -c -O3 -fdump-tree-optimized ./round.cpp
That produces a round.cpp.165t.optimized file:
;; Function int64_t my_cast(double) (_Z7my_castd, funcdef_no=224, decl_uid=4743$
int64_t my_cast(double) (double d)
{
long int t;
<bb 2>:
t_2 = (long int) d_1(D);
return t_2;
}
;; Function int64_t my_round(double) (_Z8my_roundd, funcdef_no=225, decl_uid=47$
int64_t my_round(double) (double d)
{
double t;
int64_t _3;
<bb 2>:
t_2 = round (d_1(D));
_3 = (int64_t) t_2;
return _3;
}
Here the differences are quite clear (and the call to the round function is glaring).