Why does clang ignore restrict?

Tags: c++, gcc, clang, restrict

I just tested a small example to check whether __restrict__ works in C++ on the latest compilers:

void foo(int x,int* __restrict__ ptr1, int& v2) {
   for(int i=0;i<x;i++) {
       if(*ptr1==v2) {
           ++ptr1;
       } else {
           *ptr1=*ptr1+1;
       }
   }
}

When trying it on godbolt.org with the latest gcc (gcc 8.1, -O3 -std=c++14), __restrict__ works as expected: v2 is loaded only once, since it cannot alias ptr1.

Here are the relevant assembly parts:

.L5:
  mov eax, DWORD PTR [rsi]
  cmp eax, ecx # <-- ecx contains v2, no load from memory
  jne .L3
  add edx, 1
  add rsi, 4
  cmp edi, edx
  jne .L5

Now the same with the latest clang (clang 6.0.0 -O3 -std=c++14). It unrolls the loop once, so the generated code is much bigger, but here is the gist:

.LBB0_3: # =>This Inner Loop Header: Depth=1
  mov edi, dword ptr [rsi]
  cmp edi, dword ptr [rdx] # <-- restrict didn't work, v2 loaded from memory in hot loop
  jne .LBB0_9
  add rsi, 4
  mov edi, dword ptr [rsi]
  cmp edi, dword ptr [rdx] # <-- restrict didn't work, v2 loaded from memory in hot loop
  je .LBB0_12

Why is this the case? I know that __restrict__ is non-standard and that the compiler is free to ignore it, but it is a fundamental technique for getting the last bit of performance out of one's code, so I doubt that clang would accept the keyword and then silently ignore it. So what is the issue here? Am I doing anything wrong?

asked May 16 '18 07:05

gexicide

1 Answer

So many useless comments...

This seems to be a bug in Clang's alias analysis. If you change the type of v2 to short, the compiler happily hoists its load out of the loop based on type-based aliasing rules:

for.body:                                         ; preds = %for.inc, %for.body.lr.ph
  %i.09 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.inc ]
  %ptr1.addr.08 = phi i32* [ %ptr1, %for.body.lr.ph ], [ %ptr1.addr.1, %for.inc ]
  %1 = load i32, i32* %ptr1.addr.08, align 4, !tbaa !5
  %cmp1 = icmp eq i32 %1, %conv
  br i1 %cmp1, label %if.then, label %if.else
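For reference, the modified function behind the IR above might look like the following sketch. The name foo_short is made up for illustration; the only intended difference from the question's code is the type of v2. Because a short object cannot legally be accessed through an int lvalue, type-based alias analysis alone is enough for the compiler to load v2 once before the loop (the %conv value in the IR).

```cpp
// Same loop as in the question, but v2 is a short instead of an int.
// foo_short is a hypothetical name used only for this illustration.
void foo_short(int x, int* __restrict__ ptr1, short& v2) {
    for (int i = 0; i < x; i++) {
        if (*ptr1 == v2) {
            ++ptr1;             // element matches v2: move to the next one
        } else {
            *ptr1 = *ptr1 + 1;  // otherwise increment the current element
        }
    }
}
```

Compiling this with clang -O3 -S -emit-llvm should make it possible to compare the loop body against the listing above.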

But with the original loop you get the same alias set for both memory references, which is why the middle end can't optimize it:

  %i.08 = phi i32 [ %inc, %for.inc ], [ 0, %for.body.preheader ]
  %ptr1.addr.07 = phi i32* [ %ptr1.addr.1, %for.inc ], [ %ptr1, %for.body.preheader ]
  %0 = load i32, i32* %ptr1.addr.07, align 4, !tbaa !1
  %1 = load i32, i32* %v2, align 4, !tbaa !1
  %cmp1 = icmp eq i32 %0, %1
  br i1 %cmp1, label %if.then, label %if.else

Note the !tbaa !1 attached to both memory references, which means that the compiler couldn't distinguish the memory accessed by either of them. It seems that the restrict annotation has been lost along the way...

I encourage you to reproduce this with the latest Clang and file a bug in the LLVM Bugzilla (be sure to cc Hal Finkel).
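In the meantime, one possible workaround is to do the hoisting by hand. The sketch below is an assumption on my part rather than something verified against a particular Clang version, and foo_local_copy and v2_copy are illustrative names. It copies v2 into a local variable before the loop; since the local's address is never taken, no store through ptr1 can modify it, so the compiler does not need alias information to keep it out of the hot loop. This only matches the original behaviour if ptr1 never points at v2, which is what __restrict__ already promises.

```cpp
// Hypothetical variant of foo from the question: v2 is read once, up front.
void foo_local_copy(int x, int* __restrict__ ptr1, int& v2) {
    const int v2_copy = v2;  // single load; the local's address never escapes
    for (int i = 0; i < x; i++) {
        if (*ptr1 == v2_copy) {
            ++ptr1;             // element matches: move to the next one
        } else {
            *ptr1 = *ptr1 + 1;  // otherwise increment the current element
        }
    }
}
```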

answered Oct 08 '22 16:10

yugr
