
Sink arguments and move semantics for functions that can fail (strong exception safety)

I have a function that operates on a big chunk of data passed in as a sink argument. My BigData type is already C++11-aware and comes with fully functional move constructor and move assignment implementations, so I can get away without having to copy the damn thing:

Result processBigData(BigData);

[...]

BigData b = retrieveData();
Result r = processBigData(std::move(b));

This all works perfectly fine. However, my processing function may fail occasionally at runtime resulting in an exception. This is not really a problem, since I can just fix stuff and retry:

BigData b = retrieveData();
Result r;
try {
    r = processBigData(std::move(b));
} catch(std::runtime_error&) {
    r = fixEnvironmnentAndTryAgain(b);
    // wait, something isn't right here...
}

Of course, this won't work.

Since I moved my data into the processing function, by the time I arrive in the exception handler, b will not be usable anymore.
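To make the failure mode concrete, here is a minimal sketch. The `BigData` and `Result` definitions are hypothetical stand-ins (the real types are not shown in this question), and `elementsLeftAfterFailure` is a helper invented purely for illustration. With a by-value parameter, the move happens while the argument is being initialized, before the function body runs, so the buffer is gone by the time the handler executes.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for BigData: a move-only wrapper around a buffer.
struct BigData {
    std::vector<int> payload;
    explicit BigData(std::size_t n) : payload(n, 42) {}
    BigData(BigData&&) noexcept = default;
    BigData& operator=(BigData&&) noexcept = default;
};

// Hypothetical stand-in for Result.
struct Result {
    std::size_t processed;
};

// By-value sink: the parameter is move-constructed from the caller's
// argument before the function body starts executing.
Result processBigData(BigData data) {
    if (!data.payload.empty()) {
        // Simulated runtime failure; the parameter still owns the buffer.
        throw std::runtime_error("environment not ready");
    }
    return Result{0};
}

// Illustrative helper: reports how many elements the caller still holds
// after the call has failed.
std::size_t elementsLeftAfterFailure() {
    BigData b(1000);
    assert(b.payload.size() == 1000);
    try {
        processBigData(std::move(b));
    } catch (const std::runtime_error&) {
        // The parameter object has been destroyed before we get here,
        // taking the buffer with it; b is left in its moved-from state.
    }
    return b.payload.size();
}
```

In this sketch the helper returns 0, because a `std::vector` that has been move-constructed from is guaranteed to be empty. For an arbitrary type the moved-from state is only "valid but unspecified", which is no better for a retry.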

This threatens to drastically reduce my enthusiasm for passing sink arguments by-value.

So here is the question: How should a situation like this be handled in modern C++ code? How can I regain access to data that was moved into a function which then failed to execute?

You may change the implementation and interfaces for both BigData and processBigData as you please. The final solution however should try to minimize drawbacks over the original code regarding efficiency and usability.

ComicSansMS asked Sep 04 '14 13:09


2 Answers

I'm similarly nonplussed by this issue.

As far as I can tell, the best current idiom is to divide the pass-by-value into a pair of pass-by-references.

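#include <type_traits> // std::decay_t (C++14)
#include <utility>     // std::forward

template< typename t >
std::decay_t< t >
val( t && o ) // Given an object, return a new object "val"ue by move or copy
    { return std::forward< t >( o ); }

// Rvalue overload: binding to in_rref does not move anything yet;
// the implementation decides if and when to move from it.
Result processBigData(BigData && in_rref) {
    // implementation
}

// Lvalue overload: make an explicit copy and hand it to the rvalue overload.
Result processBigData(BigData const & in_cref ) {
    return processBigData( val( in_cref ) );
}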

Of course, bits and pieces of the argument might already have been moved from by the time the exception is thrown. The problem propagates out to whatever processBigData calls.

I've had an inspiration to develop an object that moves itself back to its source upon certain exceptions, but that's a solution to a particular problem on the horizon in one of my projects. It might end up too specialized, or it might not be feasible at all.

Potatoswatter answered Nov 17 '22 05:11


Apparently this issue was the subject of lively discussion at the recent CppCon 2014. Herb Sutter summarized the latest state of things in his closing talk, Back to the Basics! Essentials of Modern C++ Style (slides).

His conclusion is quite simply: Don't use pass-by-value for sink arguments.

The arguments for using this technique in the first place (as popularized by Eric Niebler's Meeting C++ 2013 keynote C++11 Library design (slides)) seem to be outweighed by the disadvantages. The initial motivation for passing sink arguments by-value was to get rid of the combinatorial explosion for function overloads that results from using const&/&&.

Unfortunately, it seems that this has a number of unintended consequences. One is a potential efficiency drawback (mainly due to unnecessary buffer allocations). The other is the exception-safety problem from this question. Both are discussed in Herb's talk.

Herb's conclusion is to not use pass-by-value for sink arguments, but instead rely on separate const&/&& (with const& being the default and && reserved for those few cases where optimization is required).

This also matches what @Potatoswatter's answer suggested. By passing the sink argument via && we might be able to defer the actual moving of the data out of the argument to a point where we can give a noexcept guarantee.
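A minimal sketch of that deferral, under some loud assumptions: `BigData` and `Result` are hypothetical stand-ins, the `environmentReady` flag is invented to simulate the failure, and `processWithRetry` is an illustrative helper. Binding to `BigData&&` moves nothing; the move only happens at the commit point inside the function, after everything that can throw has already run.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for BigData with a noexcept move constructor.
struct BigData {
    std::vector<int> payload;
    explicit BigData(std::size_t n) : payload(n, 42) {}
    BigData(BigData&&) noexcept = default;
    BigData& operator=(BigData&&) noexcept = default;
};

// Hypothetical stand-in for Result.
struct Result {
    std::size_t processed;
};

// Sink via rvalue reference. The environmentReady flag is an assumption
// used only to simulate a runtime failure.
Result processBigData(BigData&& in, bool environmentReady) {
    // Everything that can throw runs while the caller still owns the data.
    if (!environmentReady) {
        throw std::runtime_error("environment not ready");
    }
    // Commit point: BigData's move constructor is noexcept, so taking
    // ownership here cannot fail halfway through.
    BigData local(std::move(in));
    return Result{local.payload.size()};
}

// Illustrative helper: the first attempt fails, the retry succeeds.
Result processWithRetry() {
    BigData b(1000);
    try {
        return processBigData(std::move(b), false);
    } catch (const std::runtime_error&) {
        // std::move(b) was only a cast; nothing has been moved out yet.
        assert(b.payload.size() == 1000);
        return processBigData(std::move(b), true);
    }
}
```

Note that the retry is only safe here because this particular `processBigData` promises not to touch its argument before the commit point. That promise is part of the function's contract, not something the `&&` signature guarantees by itself.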

I kind of liked the idea of passing sink arguments by-value, but it seems that it does not hold up as well in practice as everyone hoped.

Update after thinking about this for 5 years:

I am now convinced that my motivating example is a misuse of move semantics. After the invocation of processBigData(std::move(b));, I should never be allowed to assume what the state of b is, even if the function exits with an exception. Doing so leads to code that is hard to follow and to maintain.

Instead, if the contents of b should be recoverable in the error case, this needs to be made explicit in the code. For example:

class BigDataException : public std::runtime_error {
private:
    BigData b;
public:
    BigData retrieveDataAfterError() &&;

    // [...]
};


BigData b = retrieveData();
Result r;
try {
    r = processBigData(std::move(b));
} catch(BigDataException& e) {
    b = std::move(e).retrieveDataAfterError();
    r = fixEnvironmnentAndTryAgain(std::move(b));
}

If I want to recover the contents of b, I need to explicitly pass them out along the error path (in this case wrapped inside the BigDataException). This approach requires a bit of additional boilerplate, but it is more idiomatic in that it does not require making assumptions about the state of a moved-from object.
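For completeness, here is one way the elided parts could be filled in, as a sketch rather than a definitive implementation. The `BigData` and `Result` definitions, the exception's constructor, the `environmentReady` flag, and the `processWithRecovery` helper are all assumptions made for illustration; only `BigDataException` and `retrieveDataAfterError` come from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for BigData. It is kept copyable here because
// exception types are expected to be copy-constructible.
struct BigData {
    std::vector<int> payload;
    explicit BigData(std::size_t n) : payload(n, 42) {}
};

// Hypothetical stand-in for Result.
struct Result {
    std::size_t processed;
};

class BigDataException : public std::runtime_error {
private:
    BigData b;
public:
    // Assumed constructor: takes ownership of the data that failed to process.
    BigDataException(const char* what, BigData&& data)
        : std::runtime_error(what), b(std::move(data)) {}

    // Rvalue-qualified: callable only as std::move(e).retrieveDataAfterError(),
    // which makes the hand-over visible at the call site.
    BigData retrieveDataAfterError() && { return std::move(b); }
};

// By-value sink that passes its argument back out along the error path.
// The environmentReady flag is an assumption used to simulate the failure.
Result processBigData(BigData data, bool environmentReady) {
    if (!environmentReady) {
        throw BigDataException("environment not ready", std::move(data));
    }
    return Result{data.payload.size()};
}

// Illustrative helper: fail once, recover the data, then retry.
Result processWithRecovery() {
    BigData b = BigData(1000);
    try {
        return processBigData(std::move(b), false);
    } catch (BigDataException& e) {
        // No assumption about the old state of b; it is overwritten explicitly.
        b = std::move(e).retrieveDataAfterError();
        assert(b.payload.size() == 1000);
        return processBigData(std::move(b), true);
    }
}
```

The data travels by move at every step in this sketch, so the success path costs the same as the original by-value version. The price is paid in the extra exception type and in every throw site having to remember to hand the data back.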

ComicSansMS answered Nov 17 '22 03:11
