Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In which cases is it unsafe to (const) reference a return value?

Tags:

c++

I tend do use the following notation a lot:

const Datum& d = a.calc();
// continue to use d

This works when the result of calc is on the stack, see http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/. Even though compilers probably optimize here, it feels nice to explicitly avoid temporary objects.

Today I realized, that the content of d became invalid after data was written to a member of a. In this particular case the get function simply returned a reference to another member, which was completely unrelated to the write however. Like this:

const Datum& d = a.get();
// ... some operation ...
a.somedata = Datum2();
// now d is invalid.

Again, somedata has nothing to do with d or get() here.

Now I ask myself:

  1. Which side-effects could lead to the invalidation?
  2. Is it bad-practice to assign return values to a const reference? (Especially when I don't know the interior of the function)

My application is single-threaded except for a Qt GUI-Thread.

like image 267
Martin R. Avatar asked Dec 24 '22 15:12

Martin R.


2 Answers

You seem to be afraid of elision failing. That auto x = some_func(); will result in an extra move construct over auto&& x = some_func(); when some_func() returns a temporary object.

You shouldn't be.

If elision fails, it means your compiler is incompetent, or compiling with frankly hostile settings. And you cannot survive hostile settings or incompetent compilers: incompetent compilers can turn a+=b with integers into for (int i = 0; i < abs(b); ++i) {if (b>0) ++a; else --a;} and violate not one iota of the standard.

Elision is a core language feature. Don't write bad code just because you don't trust it will happen.


You should capture by reference when you want a reference to the data provided by the function, and not an independently stable copy of the data. If you don't understand the function's return value's lifetime, capturing by reference is simply not safe.

Even if you know that the data will be stable, more time is spent maintaining code than writing it: the person reading your code must be able to see at a glance that your assumption holds. And non-local bugs are bad: a seemingly innocuous change to the function you call should not break your code.


The end result is, take things by value unless you have a good reason not to.

Taking things by value makes it easier for you, and the compiler, to reason about your code. It increases locality.

When you have a good reason not to, then take things by reference.

Depending on context, this good reason might not have to be very strong. But it shouldn't be based off an assumption of an incompetent compiler.

Premature pessimization should be avoided, but so should premature optimization. Taking (or storing) references instead of values should be something you do when and if you have identified performance problems. Write clean, easy to understand code. Shove complexity into tightly written types, and have the exterior interface be clean and simple. Take things by value, because values decouple state.


Optimization is fungible. By making more of your code simpler, you can make it easier to work with (and be more productive). Then, when you identify parts where performance matters, you can expend effort there to make the code faster.

A big example is foundational code: foundational code (which is used everywhere) quickly becomes a general drag on performance if not written with performance and ease of use in mind. In that case, you want to hide complexity inside the type, and expose a simple easy to use exterior interface that doesn't require the user to understand the guts.

But code in a random function somewhere? Use values, the easiest to use containers with the friendliest O-notation to the most expensive operation you do and the easiest interface. Vectors if reasonable (avoid premature pessmiziation), but don't sweat a few maps.

Find the 1%-10% of your code that takes up 90%-99% of the time, and make that fast. If the rest of your code has good O-notation performance (so it won't become shockingly slower with larger data sets than you test with), you'll be in good shape. Then start testing with ridiculous data-sets, and find the slow parts then.

like image 130
Yakk - Adam Nevraumont Avatar answered Feb 16 '23 23:02

Yakk - Adam Nevraumont


Which side-effects could lead to the invalidation?

Holding a reference to the internal state of a class (i.e., versus extending the lifetime of a temporary) and then calling any non-const member functions may potentially invalidate the reference.

Is it bad-practice to assign return values to a const reference? (Especially when I don't know the interior of the function)

I would say the bad practice is to hold a reference to the internal state of a class instance, mutating the class instance, and continuing to use the original reference (unless it is documented that the non-const functions don't invalidate the reference)

like image 30
James Adkison Avatar answered Feb 16 '23 22:02

James Adkison