
Why does clang ignore __restrict__?

I just tested a small example to check whether __restrict__ works in C++ on the latest compilers:

void foo(int x,int* __restrict__ ptr1, int& v2) {
   for(int i=0;i<x;i++) {
       if(*ptr1==v2) {
           ++ptr1;
       } else {
           *ptr1=*ptr1+1;
       }
   }
}

When trying it on godbolt.org with the latest gcc (gcc 8.1 -O3 -std=c++14), __restrict__ works as expected: v2 is loaded only once, since it cannot alias with ptr1.

Here are the relevant assembly parts:

.L5:
  mov eax, DWORD PTR [rsi]
  cmp eax, ecx # <-- ecx contains v2, no load from memory
  jne .L3
  add edx, 1
  add rsi, 4
  cmp edi, edx
  jne .L5

Now the same with the latest clang (clang 6.0.0 -O3 -std=c++14). It unrolls the loop once, so the generated code is much bigger, but here is the gist:

.LBB0_3: # =>This Inner Loop Header: Depth=1
  mov edi, dword ptr [rsi]
  cmp edi, dword ptr [rdx] # <-- restrict didn't work, v2 loaded from memory in hot loop
  jne .LBB0_9
  add rsi, 4
  mov edi, dword ptr [rsi]
  cmp edi, dword ptr [rdx] # <-- restrict didn't work, v2 loaded from memory in hot loop
  je .LBB0_12

Why is this the case? I know that __restrict__ is non-standard and the compiler is free to ignore it, but it seems to be a very fundamental technique for getting the last bit of performance out of one's code, so I doubt that clang would accept the keyword and then simply ignore it. So, what is the issue here? Am I doing anything wrong?

asked May 16 '18 07:05 by gexicide



1 Answer

So many useless comments...

This seems to be a bug in Clang's alias analysis. If you change the type of v2 to short, the compiler happily hoists its load out of the loop based on type-based aliasing rules:

for.body:                                         ; preds = %for.inc, %for.body.lr.ph
  %i.09 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.inc ]
  %ptr1.addr.08 = phi i32* [ %ptr1, %for.body.lr.ph ], [ %ptr1.addr.1, %for.inc ]
  %1 = load i32, i32* %ptr1.addr.08, align 4, !tbaa !5
  %cmp1 = icmp eq i32 %1, %conv
  br i1 %cmp1, label %if.then, label %if.else
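
For reference, the modified source behind this experiment might look like the sketch below. The name foo_short is hypothetical; the only intended difference from the question's foo is the type of v2.

```cpp
// Hypothetical variant of the question's foo: v2 is a short instead of an int.
// Stores through an int* cannot legally modify a short object, so type-based
// alias analysis alone lets the compiler load v2 once before the loop.
void foo_short(int x, int* __restrict__ ptr1, short& v2) {
    for (int i = 0; i < x; i++) {
        if (*ptr1 == v2) {
            ++ptr1;
        } else {
            *ptr1 = *ptr1 + 1;
        }
    }
}
```

Compiling this with something like clang++ -O3 -std=c++14 -S -emit-llvm should give IR along the lines of the excerpt above, although the exact value names and metadata numbers depend on the Clang version.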

But with the original loop you get the same alias set for both memory references, which is why the middle-end can't optimize it:

  %i.08 = phi i32 [ %inc, %for.inc ], [ 0, %for.body.preheader ]
  %ptr1.addr.07 = phi i32* [ %ptr1.addr.1, %for.inc ], [ %ptr1, %for.body.preheader ]
  %0 = load i32, i32* %ptr1.addr.07, align 4, !tbaa !1
  %1 = load i32, i32* %v2, align 4, !tbaa !1
  %cmp1 = icmp eq i32 %0, %1
  br i1 %cmp1, label %if.then, label %if.else

Note the !tbaa !1 attached to both memory references, which means that the compiler couldn't distinguish the memory accessed by either of them on type grounds. It seems that the restrict annotation has been lost along the way...

I encourage you to reproduce this with the latest Clang and file a bug in the LLVM Bugzilla (be sure to cc Hal Finkel).
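
In the meantime, one possible workaround is to copy v2 into a local variable before the loop, so that the optimizer does not need any alias information to keep it in a register. This is only a sketch: the name foo_local is hypothetical, and it is equivalent to the original only under the same promise that __restrict__ makes, namely that ptr1 never points at v2.

```cpp
// Hypothetical workaround: read v2 once into a local before the loop.
// Assumption: ptr1 never aliases v2 (the same promise __restrict__ expresses),
// otherwise this changes behavior compared to re-reading v2 each iteration.
void foo_local(int x, int* __restrict__ ptr1, int& v2) {
    const int v = v2;  // single load from memory
    for (int i = 0; i < x; i++) {
        if (*ptr1 == v) {
            ++ptr1;
        } else {
            *ptr1 = *ptr1 + 1;
        }
    }
}
```

Since v is a local whose address is never taken, the comparison in the loop no longer has to go through the reference.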

answered Oct 08 '22 16:10 by yugr