Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are gcc and clang not hoisting strlen out of this loop?

Consider the following code:

#include <string.h>

void bar(char c);

void foo(const char* restrict ss) {
    for (int i = 0; i < strlen(ss); ++i) {
        bar(*ss);
    }
}    

I would expect the strlen(ss) to be hoisted out of the loop in these essentially ideal conditions; and yet - it isn't, neither by clang 5.0 nor by gcc 7.3 with maximum optimization (-O3).

Why is this the case?

Note: Inspired by (my answer to) this question.

like image 953
einpoklum Avatar asked Jan 28 '18 00:01

einpoklum


1 Answers

The other answers claim that the strlen call cannot be hoisted because the string's contents might change between calls. Those answers do not properly account for the semantics of restrict; even if bar had access to the string through a global variable or some other mechanism, the semantics of restrict pointers to const types should(see caveat) prohibit bar from modifying the string.

From C11, N1570 draft, 6.7.3.1:

1 Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.

2 If D appears inside a block and does not have storage class extern, let B denote the block. If D appears in the list of parameter declarations of a function definition, let B denote the associated block. Otherwise, let B denote the block of main (or the block of whatever function is called at program startup in a freestanding environment).

3 In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E.137) Note that ''based'' is defined only for expressions with pointer types.

4 During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P. Every access that modifies X shall be considered also to modify P, for the purposes of this subclause. If P is assigned the value of a pointer expression E that is based on another restricted pointer object P2, associated with block B2, then either the execution of B2 shall begin before the execution of B, or the execution of B2 shall end prior to the assignment. If these requirements are not met, then the behavior is undefined.

5 Here an execution of B means that portion of the execution of the program that would correspond to the lifetime of an object with scalar type and automatic storage duration associated with B.

Here, the declaration D is const char* __restrict__ ss, and the associated block B is the body of foo. All lvalues through which strlen accesses the string have &L based on ss(see caveat), and those accesses occur during the execution of B (since, by the definition in section 5, the execution of strlen is part of the execution of B). ss points to a const-qualified type, so by section 4, the compiler is allowed to assume string elements accessed by strlen are not modified during the execution of foo; modifying them would be undefined behavior.

(caveat)The above analysis assumes that strlen accesses the string through "ordinary" pointer dereferencing or indexing. If strlen uses techniques like SSE intrinsics or inline assembly, it is not clear to me whether such accesses technically count as using an lvalue to access the value of the object that it designates. If they do not count as such, the protections of restrict may not apply, and the compiler may not be able to perform the hoisting.

Maybe the above caveat voids the protections of restrict. Maybe the compiler doesn't know enough about the definition of strlen to analyze its interaction with restrict (I'm surprised it wasn't inlined). Maybe the compiler is free to perform the hoisting and just didn't realize it; perhaps some relevant optimization is not implemented, or it failed to propagate the necessary information between the right compiler components. Determining the exact cause would require much more familiarity with the GCC and Clang internals than I possess.

Further-simplified tests that eliminate strlen and the loop show that Clang definitely has some support for restrict-pointer-to-const optimizations, but I wasn't able to observe any such support from GCC.

like image 53
user2357112 supports Monica Avatar answered Oct 14 '22 15:10

user2357112 supports Monica