
Non-deterministic corruption with lambdas in C++11

Inspired by Herb Sutter's compelling lecture Not your father's C++, I decided to take another look at the latest version of C++ using Microsoft's Visual Studio 2010. I was particularly interested in Herb's assertion that C++ is "safe", because I hadn't heard how C++11 solved the well-known upwards funarg problem. From what I can tell, C++11 does nothing to solve this problem and, consequently, is not "safe".

You don't want to return a reference to a local variable because the local is allocated on a stack frame that no longer exists after the function has returned. The function therefore returns a dangling pointer to deallocated memory, and using it causes non-deterministic data corruption. C and C++ compilers know this and warn you if you try to return a reference or pointer to a local. For example, this program:

int &bar() {
  int n=0;
  return n;
}

causes Visual Studio 2010 to emit the warning:

warning C4172: returning address of local variable or temporary

However, lambdas in C++11 make it easy to capture a local variable by reference and return that reference, resulting in an equivalent dangling pointer. Consider the following function foo that returns a lambda function that captures the local variable n and returns it:

#include <functional>

std::function<int()> foo(int n) {
  return [&](){return n;};
}

This innocuous-looking function is memory unsafe and a source of corrupt data. Calling this function to get the lambda in one place and then invoking the lambda and printing its return value in another place gives this output for me:

1825836376

Moreover, Visual Studio 2010 gives no warning.

This looks like a really serious design flaw in the language to me. Even the simplest refactoring could make a lambda cross stack frames, silently introducing non-deterministic data corruption. Yet there seems to be precious little information about this problem (e.g. searching for "upwards funarg" and C++ on StackOverflow gives no hits). Are people aware of this? Is anyone working on a solution or describing workarounds?

J D asked Apr 12 '12 09:04



1 Answer

This is not specific to lambdas; you can do tons of bad things when it comes to lifetime (and you've noted at least one case of it). While C++11 might be safer in several respects compared to C++03, C++ hasn't put an emphasis on memory safety.

That's not to say that C++ doesn't want to be safe, but I'd say the usual philosophy "don't pay for what you don't use" usually gets in the way of adding safety guards (not considering things like the halting problem which might prevent issuing diagnostics for all invalid programs). If you can solve the upwards funarg problem while not affecting the performance of every other case then the Standard Committee is interested. (I don't mean that in a mean way, I think it's an interesting and hard problem.)

Since you seem to be doing some catch-up: the wisdom of authors (and others) so far is to generally refrain from using the by-reference catch-all capture for lambda expressions (e.g. [&, foo, bar]), and to be careful with by-reference capture in general. You can think of the capture list of a lambda expression as another place in C++ where you have to be careful with lifetimes. Another view is to consider a lambda expression as an object literal notation for functors (they're specified that way, in fact). You already have to be careful with lifetime when you design a class type:

struct T {}; // placeholder type, assumed for illustration

struct foo {
    explicit foo(T& t)
        : ref(t)
    {}

    T& ref;
};

foo make_foo()
{
    T t;
    // Bad
    return foo { t };
    // Not altogether different from
    // return [&t] {};
}

In this respect lambda expressions don't change the status quo when it comes to writing 'obvious' bad code, and they inherit all the preexisting caveats.
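As a rough sketch of what being careful can look like, the closure can own what it needs instead of referring to a stack frame. The names foo_by_value and make_counter below are hypothetical and chosen only for illustration. The first captures n by copy. The second assumes the closure needs mutable state that outlives the function, and keeps that state alive through a std::shared_ptr.

```cpp
#include <functional>
#include <memory>

// Same shape as the question's foo, but n is captured by value: the
// closure holds its own copy and never refers to foo_by_value's frame.
std::function<int()> foo_by_value(int n) {
  return [n]() { return n; };
}

// Hypothetical helper: when the closure needs mutable state that must
// outlive this function, shared ownership keeps the state alive for as
// long as any copy of the closure exists.
std::function<int()> make_counter() {
  std::shared_ptr<int> count = std::make_shared<int>(0);
  return [count]() { return ++*count; };
}
```

With these definitions, foo_by_value(42)() returns 42 no matter where the closure is invoked, and successive calls to a closure obtained from make_counter() return 1, 2, and so on. Neither is free: the copy and the reference count are exactly the kind of cost that by-reference capture avoids, which is why the language leaves the choice to you.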

Luc Danton answered Sep 30 '22 21:09