I have two functions: one just casts a double to std::int64_t, the other calls std::round:
#include <cmath>
#include <cstdint>
std::int64_t my_cast(double d)
{
  auto t = static_cast<std::int64_t>(d);
  return t;
}
std::int64_t my_round(double d)
{
  auto t = std::round(d);
  return t;
}
They work correctly: my_cast(3.64) returns 3 and my_round(3.64) returns 4. But when I look at the assembly, they seem to be doing the same thing, so I am wondering how they get different results.
$ g++ -std=c++1y -c -O3 ./round.cpp -o ./round.o 
$ objdump -dS ./round.o
./round.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z7my_castd>:
   0:   f2 48 0f 2c c0          cvttsd2si %xmm0,%rax
   5:   c3                      retq
   6:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
   d:   00 00 00
0000000000000010 <_Z8my_roundd>:
  10:   48 83 ec 08             sub    $0x8,%rsp
  14:   e8 00 00 00 00          callq  19 <_Z7my_castd+0x19> <========!!!
  19:   48 83 c4 08             add    $0x8,%rsp
  1d:   f2 48 0f 2c c0          cvttsd2si %xmm0,%rax
  22:   c3                      retq
Disassembly of section .text.startup:
0000000000000030 <_GLOBAL__sub_I__Z7my_castd>:
  30:   48 83 ec 08             sub    $0x8,%rsp
  34:   bf 00 00 00 00          mov    $0x0,%edi
  39:   e8 00 00 00 00          callq  3e <_GLOBAL__sub_I__Z7my_castd+0xe>
  3e:   ba 00 00 00 00          mov    $0x0,%edx
  43:   be 00 00 00 00          mov    $0x0,%esi
  48:   bf 00 00 00 00          mov    $0x0,%edi
  4d:   48 83 c4 08             add    $0x8,%rsp
  51:   e9 00 00 00 00          jmpq   56 <_Z8my_roundd+0x46>
I am not sure what the callq at offset 14 is for, but even with it, my_cast and my_round both seem to just do a cvttsd2si, which I believe is a conversion with truncation.
However, as I mentioned earlier, the two functions produce different (correct) values for the same input (say 3.64).
What is happening?
Assembly output is more useful (g++ ... -S && cat round.s):
...
_Z7my_castd:
.LFB225:
    .cfi_startproc
    cvttsd2siq  %xmm0, %rax
    ret
    .cfi_endproc
...
_Z8my_roundd:
.LFB226:
    .cfi_startproc
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    call    round             <<< This is what callq 19 means
    addq    $8, %rsp
    .cfi_def_cfa_offset 8
    cvttsd2siq  %xmm0, %rax
    ret
    .cfi_endproc
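As you can see, my_round calls std::round and then executes the cvttsd2siq instruction. This is because std::round(double) returns a double, so its result still has to be converted to int64_t, and that conversion is what cvttsd2siq does in both of your functions. By the time it runs in my_round, the value in %xmm0 has already been rounded to a whole number (4.0 for an input of 3.64), so the truncation no longer changes it. The objdump listing shows the call as "callq 19" because round.o has not been linked yet: the call target is still a placeholder that the linker fills in later (objdump -dr prints the relocation next to the instruction).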
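To make the two steps visible in source form, here is a minimal sketch that spells out what my_round does implicitly. The names cast_only and round_then_cast are made up for this illustration and do not appear in the question.

```cpp
#include <cmath>
#include <cstdint>

// Truncation only: the single cvttsd2si step, which rounds toward zero.
std::int64_t cast_only(double d)
{
  return static_cast<std::int64_t>(d);
}

// Same behaviour as my_round, with the implicit conversion written out.
std::int64_t round_then_cast(double d)
{
  double rounded = std::round(d);             // the "call round" step; the result is still a double
  return static_cast<std::int64_t>(rounded);  // the cvttsd2si step; rounded is already a whole number
}
```

For 3.64, cast_only gives 3 and round_then_cast gives 4. For -3.64 they give -3 and -4, since truncation goes toward zero while std::round goes to the nearest integer, with halfway cases rounded away from zero.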
With g++ you can get a higher-level view of what's happening by using the -fdump-tree-optimized switch:
$ g++ -std=c++1y -c -O3 -fdump-tree-optimized ./round.cpp
That produces a round.cpp.165t.optimized file (the pass number in the file name varies with the GCC version):
;; Function int64_t my_cast(double) (_Z7my_castd, funcdef_no=224, decl_uid=4743$
int64_t my_cast(double) (double d)
{
  long int t;
  <bb 2>:
  t_2 = (long int) d_1(D);
  return t_2;
}
;; Function int64_t my_round(double) (_Z8my_roundd, funcdef_no=225, decl_uid=47$
int64_t my_round(double) (double d)
{
  double t;
  int64_t _3;
  <bb 2>:
  t_2 = round (d_1(D));
  _3 = (int64_t) t_2;
  return _3;
}
Here the differences are quite clear (and the call to the round function glaring).
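The types that auto deduces in the two functions (long int t in my_cast, double t in my_round) can also be checked at compile time. This is a small sketch, not part of the original code, and it uses the C++14-compatible std::is_same<...>::value spelling to match the -std=c++1y flag above.

```cpp
#include <cmath>
#include <cstdint>
#include <type_traits>

// auto t = std::round(d); deduces double, matching "double t;" in the dump.
static_assert(std::is_same<decltype(std::round(0.0)), double>::value,
              "std::round(double) returns double");

// auto t = static_cast<std::int64_t>(d); deduces std::int64_t,
// which the dump prints as "long int" on this x86-64 target.
static_assert(std::is_same<decltype(static_cast<std::int64_t>(0.0)), std::int64_t>::value,
              "the cast yields std::int64_t");
```

If either assertion were false, the file would fail to compile, so a successful build confirms that the conversion to int64_t in my_round only happens at the return statement.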