Consider the following code:
int g(std::vector<int>&, size_t);

int f(std::vector<int>& v) {
    int res = 0;
    for (size_t i = 0; i < v.size(); i++)
        res += g(v, i);
    return res;
}
The compiler cannot optimize away the evaluation of v.size() within the loop, since it cannot prove that the size won't change inside g. The assembly generated with GCC 9.2, -O3, and x64 is:
.L3:
        mov     rsi, rbx
        mov     rdi, rbp
        add     rbx, 1
        call    g(std::vector<int, std::allocator<int> >&, unsigned long)
        add     r12d, eax
        mov     rax, QWORD PTR [rbp+8]   // load the end pointer
        sub     rax, QWORD PTR [rbp+0]   // subtract the begin pointer
        sar     rax, 2                   // divide by sizeof(int) => size()
        cmp     rbx, rax
        jb      .L3
If we know that g does not alter v.size(), we can rewrite the loop as follows:
for (size_t i = 0, e = v.size(); i < e; i++)
    res += g(v, i);
This generates simpler (and thus faster) assembly without those mov, sub, and sar instructions: the value of size() is simply kept in a register.
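For reference, here is a sketch of the complete rewritten function, assuming the same declaration of g as above:

#include <vector>

int g(std::vector<int>&, size_t);

int f(std::vector<int>& v) {
    int res = 0;
    // e caches v.size() once; since g cannot touch the local e,
    // the compiler can keep it in a register for the whole loop.
    for (size_t i = 0, e = v.size(); i < e; i++)
        res += g(v, i);
    return res;
}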
I would expect that the same effect could be achieved by making the vector const (I know this changes the semantics of the program, since g can no longer alter the vector's elements, but that should be irrelevant to the question):
int g(const std::vector<int>&, size_t);

int f(const std::vector<int>& v) {
    int res = 0;
    for (size_t i = 0; i < v.size(); i++)
        res += g(v, i);
    return res;
}
Now, the compiler should know that the pointers loaded in each loop iteration cannot change and, therefore, that the result of

        mov     rax, QWORD PTR [rbp+8]
        sub     rax, QWORD PTR [rbp+0]
        sar     rax, 2

is always the same. Despite that, these instructions are still present in the generated assembly; live demo is here.
I have also tried Intel 19, Clang 9, and MSVC 19 with the very same results. Since all the mainstream compilers behave in such a uniform way, I wonder whether there is some rule that disallows this kind of optimization, i.e., moving the evaluation of size() for a constant vector out of the loop.
Adding const does not mean that g cannot change the size of the vector. Sadly.
You get UB if you modify an object that is itself const. Modifying an object to which you have a const reference is not UB, as long as the original (i.e., referenced) object is not const.
In other words, this is a valid program:
#include <vector>

int g(const std::vector<int>& v, size_t)
{
    const_cast<std::vector<int>&>(v).clear();
    return 0;
}

int f(const std::vector<int>& v) {
    int res = 0;
    for (size_t i = 0; i < v.size(); i++)
        res += g(v, i);
    return res;
}

void test()
{
    std::vector<int> v;
    f(v);
}
Note that if we cannot see the definition of g, the const doesn't matter even if const_cast were disallowed:
#include <vector>

std::vector<int> global_v;

int g(const std::vector<int>& v, size_t)
{
    global_v.clear();
    return 0;
}

int f(const std::vector<int>& v) {
    int res = 0;
    for (size_t i = 0; i < v.size(); i++)
        res += g(v, i);
    return res;
}

void test()
{
    f(global_v);
}
Consider const on references as a reminder and compiler-enforced safeguard for yourself (and others), but not so much as an optimization opportunity for the compiler. However, marking values/objects themselves as const has decent chances of improving optimizations: passing an actual const Something to an opaque function guarantees that Something holds the same thing afterwards (but note that const is not transitive through e.g. pointer or reference members).
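A minimal sketch of that distinction (the names v_const and f_const are illustrative; whether a particular compiler actually hoists the size() load is not guaranteed):

#include <vector>

int g(const std::vector<int>&, size_t);  // opaque to the compiler

// The object itself is const: modifying it, even via const_cast inside g,
// would be undefined behaviour, so the compiler is in principle allowed
// to assume its size() does not change across the call.
const std::vector<int> v_const{1, 2, 3};

int f_const() {
    int res = 0;
    for (size_t i = 0; i < v_const.size(); i++)
        res += g(v_const, i);
    return res;
}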
The sad truth is that const on function parameters is not that useful for optimization. Imagine code like this (I know it is not good practice and so forth, but for the sake of argument):
#include <vector>

int g(const std::vector<int>& v) {
    const_cast<std::vector<int>&>(v).push_back(1);
    return 0;
}

int f(const std::vector<int>& v) {
    int res = 0;
    for (size_t i = 0; i < v.size(); i++)
        res += g(v);
    return res;
}

int main() {
    std::vector<int> v{1, 2, 3};
    return f(v);
}
If the vector passed to f is not itself const, this is legal code.
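Conversely, here is a short sketch of the case where that same const_cast becomes undefined behaviour, because the object passed in is itself const:

#include <vector>

int g(const std::vector<int>& v) {
    // UB whenever the referenced object is itself const:
    const_cast<std::vector<int>&>(v).push_back(1);
    return 0;
}

int main() {
    const std::vector<int> v{1, 2, 3};  // the object itself is const now
    return g(v);                        // modifying v inside g is UB
}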