Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What prevents g++ from eliminating temporary std::array not used in runtime?

#include <array>
#include <cassert>

class P {
  public:
    P() : _value(nullptr) {}
    ~P() { delete _value; }

  private:
   char *_value;
};

void foo() {
  if(std::array<P, 4>().size() != 4)
    assert(false);
}

The function foo() creates a temporary array to check the size is what the programmer expected. With -O1 or higher g++ figures out the assert will not fail and the call to __assert_fail is removed from the generated code. But g++ still generates code to first construct and then destruct the now unused array.

g++ -std=c++11 -O3 [4.8.2]:

0000000000000000 <_Z3foov>:1
   0:       55                      push   %rbp1
   1:       66 0f ef c0             pxor   %xmm0,%xmm01
   5:       53                      push   %rbx1
   6:       48 83 ec 28             sub    $0x28,%rsp1
   a:       66 0f 7f 04 24          movdqa %xmm0,(%rsp)1
   f:       48 8d 5c 24 20          lea    0x20(%rsp),%rbx1
  14:       48 89 e5                mov    %rsp,%rbp1
  17:       66 0f 7f 44 24 10       movdqa %xmm0,0x10(%rsp)1
  1d:       0f 1f 00                nopl   (%rax)1
  20:       48 83 eb 08             sub    $0x8,%rbx1
  24:       48 8b 3b                mov    (%rbx),%rdi1
  27:       e8 00 00 00 00          callq  2c <_Z3foov+0x2c>1
  2c:       48 39 eb                cmp    %rbp,%rbx1
  2f:       75 ef                   jne    20 <_Z3foov+0x20>1
  31:       48 83 c4 28             add    $0x28,%rsp1
  35:       5b                      pop    %rbx1
  36:       5d                      pop    %rbp1
  37:       c3                      retq   1

clang on the other hand removes all code except the return statement.

clang -std=c++11 -O3:

0000000000000000 <_Z3foov>:1
   0:       c3                      retq   1

Just bad luck with g++ or is there a reason for the difference?

like image 503
treap Avatar asked Jan 01 '14 15:01

treap


1 Answers

Firstly, great question.

EDIT:

I didn't really read your code properly the first time. There is an important external call in your code. This is in this instruction e8 00 00 00 00 callq 2c <_Z3foov+0x2c>1. It does not call the address ahead of itself - instead it's target will get replaced at link time. This is how linking works - the elf file will say "such and such an instruction will resolve to this target at link time." The assembler has not been listed completely, and so we don't know the address of this call. Presumably it is delete _value in the code, however libstdc++ (with delet, etc) is dynamically linked by default. You can change this by using my compiler flags, or do your listing inside gdb (i.e. after linking).

Back to the answer:

This is a special case where gcc programmers have made a choice not to optimize. The reason being that operators new and delete are marked as 'weak symbols' and the linker will look for alternatives, choosing a user supplied or falling back if none is found.

Here is a discussion of the rationale behind this. These operators are designed to be globally replaceable, and weak linking is one solution.

Statically linking doesn't change this, because free and malloc in glibc can still be changed at linktime.

Statically linking, or using link time optimization should use the in-built delete, however in this case, if you instead linked statically, the chance is still missed. In your original example, however, there is no such chance.

Compiling this:

#include <array>
#include <cassert>
#include <cstdlib>

void * operator new(std::size_t n) throw(std::bad_alloc)
{
  return malloc(n);
}
void operator delete(void * p) throw()
{
if(p != nullptr)
  free(p);
}

class P {
  public:
    P() : _value(nullptr) {}
    ~P() { delete _value; }

  private:
   char *_value;
};

void foo() {
  if(std::array<P, 4>().size() != 4)
    assert(false);
}

int main(){
foo();
}

with this; g++ -std=c++11 -O3 -static -Wa,-alh test.cpp -o test

links to libstdc++/glibc statically, so that it should know what free and malloc are, however it doesn't quite realise foo is trivial.

However, nm -gC test | grep free produces the weak symbol __free_hook, which I found an explanation for here. So the behaviour of free and malloc (and hence, operator new and delete) is actually always changeable at link time when compiling with gcc. That is what prevents the optimization - annoyingly the -fno-weak leaves those symbols there.

The above paragraph is true, but is a result of the optimization being missed, not a reason to avoid optimization.

With link time optimization it is possible gcc can be coaxed into doing this optimization, but first you must build everything else with -flto, including libstdc++.

Here is the most gcc does for me when building the example statically:

Dump of assembler code for function _Z3foov:
0x08048ef0 <+0>:    push   %esi
0x08048ef1 <+1>:    push   %ebx
0x08048ef2 <+2>:    sub    $0x24,%esp
0x08048ef5 <+5>:    movl   $0x0,0x10(%esp)
0x08048efd <+13>:   lea    0x20(%esp),%ebx
0x08048f01 <+17>:   movl   $0x0,0x14(%esp)
0x08048f09 <+25>:   lea    0x10(%esp),%esi
0x08048f0d <+29>:   movl   $0x0,0x18(%esp)
0x08048f15 <+37>:   movl   $0x0,0x1c(%esp)
0x08048f1d <+45>:   lea    0x0(%esi),%esi
0x08048f20 <+48>:   sub    $0x4,%ebx
0x08048f23 <+51>:   mov    (%ebx),%eax
0x08048f25 <+53>:   test   %eax,%eax
0x08048f27 <+55>:   je     0x8048f31 <_Z3foov+65>
0x08048f29 <+57>:   mov    %eax,(%esp)
0x08048f2c <+60>:   call   0x804dea0 <free>
0x08048f31 <+65>:   cmp    %esi,%ebx
0x08048f33 <+67>:   jne    0x8048f20 <_Z3foov+48>
0x08048f35 <+69>:   add    $0x24,%esp
0x08048f38 <+72>:   pop    %ebx
0x08048f39 <+73>:   pop    %esi
0x08048f3a <+74>:   ret  

The call to free isn't going anywhere.

If it is a problem, optimize it yourself. Templates make this easy (assuming std::array is passed as a template argument, otherwise why are you checking it's size()?).

#include <array>
#include <cassert>

class P {
  public:
    P() : _value(nullptr) {}
    ~P() { delete _value; }

  private:
   char *_value;
};

void foo() {
  if(std::tuple_size<std::array<P, 4> >::value != 4)
    assert(false);
}

int main(){
foo();
}

The code can be made to fail silently if std::array<P, 4> is vector or something, and fall back on you default construction method.

nm -C test outputs W operator new(unsigned int, void*) when I added #include <new>, so placement new at least can be changed link time (it is a weak symbol) - the others are special, and their eventual targets reside in libstdc++ (again, they are linked dynamically by default).

like image 76
user3125280 Avatar answered Oct 20 '22 17:10

user3125280