
Passing inline functions as arguments

Tags: c++, c++11

I'm wondering if C++ will still obey the inline keyword when a function is passed as an argument. In the following example, would a new frame for onFrame be pushed onto the stack every time frame() is called in the while loop?

#include <functional>

bool interrupt = false;

void run(std::function<void()> frame) {
    while(!interrupt) frame();
}

inline void onFrame() {
    // do something each frame
}

int main() {
    run(onFrame);
}

Or would changing to this have any effect?

void run(std::function<inline void()> frame) {
    while(!interrupt) frame();
}

If you have no definitive answer, can you help me find a way to test this? Possibly using memory addresses or some sort of debugger?

Evan Kennedy asked Mar 29 '16 16:03



2 Answers

It's going to be pretty hard for the compiler to inline your function if it has to go through std::function's type-erased dispatch to get there. It's possible it'll happen anyway, but you're making it as hard as possible. Your proposed alternative (taking a std::function<inline void()> argument) is ill-formed: inline is a specifier on a declaration, not part of a function type.

If you don't need type erasure, don't use type erasure. run() can simply take an arbitrary callable:

template <class F>
void run(F frame) {
    while(!interrupt) frame();
}

That is much easier for the compiler to inline. However, simply declaring a function inline does not in and of itself guarantee that the function gets inlined. See this answer.

Note also that passing a function pointer makes the call less likely to get inlined, which is awkward: the template parameter is then deduced as a plain pointer type, so the compiler only knows the target if it can trace the pointer's value. I'm trying to find an answer on here that had a great example, but until then, if inlining is super important, wrapping it in a lambda may be the way to go:

run([]{ onFrame(); });
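To illustrate why the lambda wrapper can help, here is a minimal sketch of what the template parameter is deduced as in each case. The probe function deducedFunctionPointer and the two helpers are hypothetical names introduced only for this example; onFrame is left empty here.

```cpp
#include <type_traits>

inline void onFrame() {}

// Hypothetical probe: reports whether the deduced parameter type is a plain
// function pointer. Passing a function name by value decays it to a pointer.
template <class F>
bool deducedFunctionPointer(F) {
    return std::is_same<F, void (*)()>::value;
}

// F is deduced as void(*)(): the same instantiation would serve every
// function with this signature, so the target is only a run-time value.
bool passedByName() { return deducedFunctionPointer(onFrame); }

// F is deduced as a unique closure type whose operator() calls onFrame
// directly, so the target is part of the type itself.
bool passedByLambda() { return deducedFunctionPointer([] { onFrame(); }); }
```

With the lambda, the callee is encoded in the type the template is instantiated with, which is why the optimiser tends to have an easier time. Whether it actually inlines the call is still up to the compiler.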
Barry answered Nov 14 '22 22:11



still obey the inline keyword ... would a new frame ... be pushed onto the stack

That isn't what the inline keyword does in the first place (see this question for extensive reference).
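As a small illustration that inline is not part of a function's type (which is also why it cannot appear inside std::function<...>), the sketch below compares an inline and a non-inline function. The names plainFrame, inlineFrame and sameFunctionType are hypothetical and exist only for this example.

```cpp
#include <type_traits>

void plainFrame() {}          // ordinary function (hypothetical name)
inline void inlineFrame() {}  // same signature, declared inline (hypothetical name)

// inline is a property of the declaration: it permits the same definition
// to appear in several translation units. It does not change the type.
static_assert(std::is_same<decltype(plainFrame), decltype(inlineFrame)>::value,
              "both functions have type void()");

// Helper so the same check can also be observed at run time.
bool sameFunctionType() {
    return std::is_same<decltype(plainFrame), decltype(inlineFrame)>::value;
}
```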


Assuming, as Barry does, that you're hoping to persuade the optimiser to inline your function call (once more for luck: this is nothing to do with the inline keyword), function template+lambda is probably the way to go.

To see why this is, consider what the optimiser has to work with in each of these cases:

  1. function template + lambda

    template <typename F>
    void run(F frame) { while(!interrupt) frame(); }
    
    // ... call site ...
    run([]{ onFrame(); });
    

    Here, the function only exists at all (is instantiated from the template) at the call site, with everything the optimiser needs to work with in scope and well-defined.

    Note that the optimiser may still reasonably choose not to inline a call if it thinks the extra instruction cache pressure will outweigh the saving from skipping the call and its stack frame.

  2. function pointer

    void run(void (*frame)()) { while(!interrupt) frame(); }
    
    // ... call site ...
    run(onFrame);
    

    Here, run may have to be compiled as a standalone function (although that copy may be thrown away by the linker if it can prove no-one used it), and the same goes for onFrame, especially since its address is taken. Finally, the optimiser may need to consider whether run is called with many different function pointers, or just one, when deciding whether to inline these calls. Overall, it seems like more work, and may end up as a link-time optimisation.

    NB. I used "standalone function" to mean the compiler likely emits the code & symbol table entry for a normal free function in both cases.

  3. std::function

    This is already getting long. Let's just notice that this class goes to great lengths (the type erasure Barry mentioned) to make the function

    void run(std::function<void()> frame);
    

    not depend on the exact type of the function, which means hiding information from the compiler at the point it generates the code for run, which means less for the optimiser to work with (or conversely, more work required to undo all that careful information hiding).
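To make the information hiding in case 3 concrete, here is a rough sketch in which one non-template function accepts three unrelated callable types. runTimes is a hypothetical bounded variant of run (so the example terminates), and counter, bump, Bumper and erasedCalls are likewise names made up for illustration.

```cpp
#include <functional>

// Hypothetical bounded variant of run(). It is an ordinary function, so it
// is compiled once without knowing which concrete callable it will receive.
void runTimes(std::function<void()> frame, int times) {
    for (int i = 0; i < times; ++i) frame();
}

int counter = 0;

void bump() { ++counter; }                      // free function
struct Bumper {                                 // function object
    void operator()() const { ++counter; }
};

// All three calls go through the same runTimes; the concrete type of each
// callable is erased when it is converted to std::function<void()>.
int erasedCalls() {
    counter = 0;
    runTimes(bump, 2);                   // adds 2
    runTimes([] { counter += 10; }, 2);  // adds 20
    runTimes(Bumper{}, 2);               // adds 2
    return counter;                      // 24
}
```

The convenience is real, but inside runTimes every call to frame() is an indirect call unless the optimiser can see through the wrapper at a particular call site.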


As for testing what your optimiser does, you need to examine this in the context of your whole program: it's free to choose different heuristics depending on code size and complexity.

To be totally sure what it actually did, just disassemble with source or compile to assembler. (Yes, that's potentially a big "just", but it's platform-specific, not really on-topic for the question, and a skill worth learning anyway).
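As a starting point for that kind of inspection, a small probe file along the following lines may help. This is only a sketch: the file name, the bounded run overload, sink and probe are all assumptions made for illustration, and the command in the comment assumes a GCC or Clang toolchain.

```cpp
// inline_probe.cpp (hypothetical file name)
//
// Assuming GCC or Clang, a command such as
//   g++ -std=c++11 -O2 -S inline_probe.cpp -o inline_probe.s
// writes assembly instead of an object file. In the output, find the body
// of probe() and check whether it still contains a call to the (mangled)
// onFrame symbol, or whether the store to sink appears directly in the loop.

volatile int sink = 0;  // volatile so the writes cannot be optimised away

inline void onFrame() { sink = sink + 1; }

// Bounded variant of run(), an assumption added so the loop terminates.
template <class F>
void run(F frame, int times) {
    for (int i = 0; i < times; ++i) frame();
}

// Non-inline entry point to look for in the assembly output.
int probe() {
    sink = 0;
    run([] { onFrame(); }, 4);
    return sink;
}
```

Swapping the template for a function pointer or std::function parameter and comparing the generated assembly at the same optimisation level should show what your particular compiler does with each form.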

Useless answered Nov 14 '22 22:11
