Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can `memset` function call be removed by compiler?

Tags:

I have read here that compiler is free to remove call to memset if it knows that passed memory buffer is never used again. How is that possible? It seems to me that (from the point of view of core language) memset is just a regular function, and compiler has no right to assume that whatever happens inside it, will have no side effects.

In linked article they show how Visual C++ 10 removed memset. I know that Microsoft compilers are not leading in standard compliance, so I ask - is it according to standard, or is it just msvc-ism? If it's according to standard, please elaborate ;)

EDIT: @Cubbi

Following code:

void testIt(){
    char foo[1234];
    for (int i=0; i<1233; i++){
        foo[i] = rand()%('Z'-'A'+1)+'A';
    }
    foo[1233]=0;
    printf(foo);
    memset(foo, 0, 1234);
}

Compiled under mingw with lines:

g++ -c -O2 -frtti -fexceptions -mthreads -Wall -DUNICODE -o main.o main.cpp
g++ -Wl,-s -Wl,-subsystem,console -mthreads -o main.exe main.o
objdump -d -M intel -S main.exe > dump.asm

Gave output:

 4013b0:    55                      push   ebp
 4013b1:    89 e5                   mov    ebp,esp
 4013b3:    57                      push   edi
 4013b4:    56                      push   esi
 4013b5:    53                      push   ebx
 4013b6:    81 ec fc 04 00 00       sub    esp,0x4fc
 4013bc:    31 db                   xor    ebx,ebx
 4013be:    8d b5 16 fb ff ff       lea    esi,[ebp-0x4ea]
 4013c4:    bf 1a 00 00 00          mov    edi,0x1a
 4013c9:    8d 76 00                lea    esi,[esi+0x0]
 4013cc:    e8 6f 02 00 00          call   0x401640
 4013d1:    99                      cdq    
 4013d2:    f7 ff                   idiv   edi
 4013d4:    83 c2 41                add    edx,0x41
 4013d7:    88 14 1e                mov    BYTE PTR [esi+ebx*1],dl
 4013da:    43                      inc    ebx
 4013db:    81 fb d1 04 00 00       cmp    ebx,0x4d1
 4013e1:    75 e9                   jne    0x4013cc
 4013e3:    c6 45 e7 00             mov    BYTE PTR [ebp-0x19],0x0
 4013e7:    89 34 24                mov    DWORD PTR [esp],esi
 4013ea:    e8 59 02 00 00          call   0x401648
 4013ef:    81 c4 fc 04 00 00       add    esp,0x4fc
 4013f5:    5b                      pop    ebx
 4013f6:    5e                      pop    esi
 4013f7:    5f                      pop    edi
 4013f8:    c9                      leave  
 4013f9:    c3                      ret   

In line 4013ea there is memset call, so mingw haven't removed it. Since mingw is really GCC in windows skin, I suppose GCC does it the same - I will check it when I reboot into linux.

Still having trouble finding such compiler?

EDIT2:

I just found out about GCC's __attribute__ ((pure)). So it's not that compiler knows something special about memset and elides it, it's just that it's allowed in it's header - where programmer using it should also see it ;) My mingw doesn't have this attribute in memset declaration, thus it's not eliding from the assembly no matter what - as I would expect. I will have to investigate this.

like image 965
j_kubik Avatar asked Mar 21 '13 02:03

j_kubik


People also ask

Does memset write to memory?

memset() will blindly write to the specified address for the number of specified bytes, regardless of what it might be overwriting. It is up to the programmer to ensure that only valid memory is written to.

Is memset faster than a loop?

You can assume that memset will be at least as fast as a naive implementation such as the loop. Try it under a debug build and you will notice that the loop is not replaced. That said, it depends on what the compiler does for you. Looking at the disassembly is always a good way to know exactly what is going on.

What does __ Builtin_unreachable do?

The purpose of __builtin_unreachable is to help the compiler to: Remove dead code (that programmer knows will never be executed) Linearize the code by letting compiler know that the path is "cold" (similar effect is achieved by calling noreturn function)

What is memset stackoverflow?

memset() is usually used to initialise values. For example consider the following struct: struct Size { int width; int height; } If you create one of these on the stack like so: struct Size someSize; Then the values in that struct are going to be undefined.


Video Answer


1 Answers

"compiler has no right to assume that whatever happens inside it, will have no side effects."

That's correct. But if the compiler in fact knows what actually happens inside it and can determine that it really has no side effects, then no assumption is needed.

This is how almost all compiler optimizations work. The code says "X". The compiler determines that if "Y" is true, then it can replace code "X" with code "Z" and there will be no detectable difference. It determines "Y" is true, and then it replaces "X" with "Z".

For example:

void func()
{
  int j = 2;
  foo();
  if (j == 2) bar();
   else baz();
}

The compiler can optimize this to foo(); bar();. The compiler can see that foo cannot legally modify the value of j. If foo() somehow magically figures out where j is on the stack and modifies it, then the optimization will change the behavior of the code, but that's the programmer's fault for using "magic".

void func()
{
  int j = 2;
  foo(&j);
  if (j == 2) bar();
   else baz();
}

Now it can't because foo can legally modify the value of j without any magic. (Assuming the compiler can't look inside foo, which in some cases it can.)

If you do "magic", then the compiler can make optimizations that break your code. Stick to the rules and don't use magic.

In the example you linked to, the code relies on the compiler bothering to put a particular value in a variable that is never accessed and immediately ceases to exist. The compiler is not required to do anything that has no effect on the operation of your code.

The only way that could effect the code is if it peeked at unallocated portions of the stack or relied on new allocations on the stack having values they previously had. Requiring the compiler to do that would make a huge number of optimizations impossible, including replacing local variables with registers.

like image 160
David Schwartz Avatar answered Sep 20 '22 16:09

David Schwartz