Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Range-based for loop on a temporary range [duplicate]

Thanks to some segmentation faults and warnings in valgrind, I found that this code is incorrect and has some sort of dangling reference in the for-range loop.

#include<numeric>
#include<vector>

auto f(){
    std::vector<std::vector<double>> v(10, std::vector<double>(3));
    iota(v[5].begin(), v[5].end(), 0);
    return v;
}

int main(){
    for(auto e : f()[5])
        std::cout << e << std::endl;
    return 0;
}

It looks as if the begin and end is taken from a temporary and lost in the loop.

Of course, a way around is to do

    auto r = f()[5];
    for(auto e : r)
        std::cout << e << std::endl;

However, I wonder exactly why for(auto e : f()[5]) is an error and also if there is a better way around or some way to design f or the even the container (std::vector) to avoid this pitfall.

With iterator loops is more obvious why this problem happens (begin and end come from different temporary objects)

for(auto it = f()[5].begin(); it != f()[5].end(); ++it)

But in a for-range loop, as in the first example, it seems very easy to make this mistake.

like image 596
alfC Avatar asked Jul 20 '18 06:07

alfC


1 Answers

I wonder exactly why for(auto e : f()[5]) is an error

I'll just answer this part. The reason is that range-based for statements are just syntactic sugar for, approximately:

{
    auto&& __range = f()[5]; // (*)
    auto __begin = __range.begin(); // not exactly, but close enough
    auto __end = __range.end();     // in C++17, these types can be different
    for (; __begin != __end; ++__begin) {
        auto e = *__begin;
        // rest of body
    }
}

Take a look at that first line. What happens? operator[] on a vector returns a reference into that object, so __range is bound to that internal reference. But then the temporary goes out of scope at the end of the line, destroying all of its internals, and __range is immediately a dangling reference. There is no lifetime extension here, we are never binding a reference to a temporary.

In the more normal case, for(auto e : f()), we'd bind __range to f() directly, which is binding a reference to a temporary, so that temporary would get its lifetime extended to the lifetime of the reference, which would be the full for statement.

To add more wrinkles, there are other cases where indirect binding like this would still do lifetime extension. Like, say:

struct X {
    std::vector<int> v;
};
X foo();

for (auto e : foo().v) {
    // ok!
}

But rather than try to keep track of all of these little cases, it's much better to, as songyuanyao suggests, use the new for statement with initializer... all the time:

for (auto&& range = f(); auto e : range[5]) {
    // rest of body
}

Although in a way this gives a false sense of security, since if you did it twice, you'd still have the same issue...

for (auto&& range = f().g(); auto e : range[5]) {
    // still dangling reference
}
like image 149
Barry Avatar answered Nov 08 '22 17:11

Barry