std::function is known to have performance issues because it may perform heap allocations. Admittedly, if you are being 100% honest, one heap allocation should hardly be a problem in most cases... but let's just assume that a heap allocation is undesirable or prohibitive in a particular scenario. Maybe we're doing a few million callbacks and don't want a few million heap allocations for that, whatever.
So... we want to avoid that heap allocation.
The Dr. Dobbs article "Efficient Use of Lambda Expressions and std::function" gives a recommendation on optimizing the use of std::function by taking advantage of the small object optimization that is recommended by the standard and implemented in every mainstream standard library.
The article goes to some length explaining how the standard library must copy the functor, since the std::function object might outlive the original functor (though you can use std::ref if you are sure it doesn't), which would be bad mojo. Captures need to be copied as well, and here is the problem: the exact type of the closure (or its size) is not known beforehand, as it could be any type of closure with any number of captures, so some compromise must be made. Up to a certain size, the captures are saved in a store inside the function object; beyond that, they are dynamically allocated. The store is small, anywhere from 12 to 16 bytes, so on a 64-bit build it holds at most two pointers (not counting the actual function pointer).
Dr. Dobbs thus recommends (and several other sites pick up that advice, seemingly without much objection) capturing a reference to a struct that holds references to what you actually want to capture. That way, you only capture one reference, which is just perfect, since it will always fit into the small object store.
How does that work? The assumption which made copying stuff around necessary in the first place was that the function object may outlive the scope of the original closure. Which means, of course, that it also outlives the structure that it holds a reference to, as well as anything referenced from inside that struct.
How is this supposed to work? And since I can't see how it could possibly work, is there a better well-known recipe to address this? (one that doesn't reference invalid objects)
I don't think it's supposed to work if the function object does outlive its calling function (and you're capturing references to objects that are on the stack).
In many practical cases the function object is used locally and will not outlive its caller, and then you can avoid the heap allocation (but then again, the compiler might be able to optimize the references, and the entire struct technique is probably not necessary).
Here's a simple test which compiles but crashes (tested with Clang in C++14 mode).