
Are function arguments laid out in memory the same way as structs?

Tags:

c++

The question arises from the following idea. There is a function that acts as a proxy/hook for the true function, e.g.

int foo(int a, float b, void* c, std::string d, int& e, int f) {
    // possible bookkeeping here
    // ...

    // notice missing "a" argument, passing a *tail* of all arguments
    return foo_impl(b, c, d, e, f);
}

I can't change foo's interface, but I can change foo_impl.

Now, I would like to call foo_impl with minimal overhead, in particular without copying the arguments again, e.g.

struct FooArgs {
   float b; 
   void* c; 
   std::string d;
   int& e;
   int f;
};

int foo_impl(FooArgs* args);

int foo(int a, float b, void* c, std::string d, int& e, int f) {
    // ...
    auto args = FooArgs{b, c, d, e, f};
    return foo_impl(&args);
}

No luck: I still need to pack all of the arguments into FooArgs.

It should be possible to take a pointer to foo's b argument and reinterpret it as FooArgs*:

int foo(int a, float b, void* c, std::string d, int& e, int f) {
    // ...
    return foo_impl(reinterpret_cast<FooArgs*>(&b));
}

I haven't tried it in action, but I smell undefined behavior. Is it the case?

Alexey Larionov asked Dec 21 '25 06:12


1 Answer

In most cases, for functions with a reasonable number of arguments, all of the arguments are passed in registers under the calling conventions of modern general-purpose systems, and none are passed directly in memory. For Microsoft x64, the first 4 arguments are passed in registers, with each position using either an integer or a floating-point register depending on the argument's type. For System V AMD64, the first 6 non-floating-point and first 8 floating-point arguments are passed in registers. For ARM64, the first 8 non-floating-point and first 8 floating-point arguments are passed in registers.

On x86-64, a class argument with a non-trivial copy constructor or destructor, such as std::string, is not itself placed in registers. The object is stored in memory, typically on the caller's stack, and a pointer to it is passed in a register. The exact location is up to the compiler's convenience at that specific call site, and passing the pointer lets the value be handed down efficiently as long as there is no need to create a copy.
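To make this concrete, here is a sketch of how the question's two signatures would map to registers under the System V AMD64 convention. The register names in the comments are my reading of that ABI (std::string is passed as a hidden pointer to a caller-owned object, and a reference is passed as an address), and the function bodies and run_example are placeholders invented for illustration, so check the generated assembly before relying on any of it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the real implementation.
// Assumed System V AMD64 assignment:
//   b -> xmm0, c -> rdi, d -> rsi (hidden pointer), e -> rdx (address), f -> ecx
int foo_impl(float b, void* c, std::string d, int& e, int f) {
    e += f;  // write through the reference so the effect is observable
    return static_cast<int>(b) + static_cast<int>(d.size()) + (c ? 1 : 0);
}

// Signature from the question.
// Assumed System V AMD64 assignment:
//   a -> edi, b -> xmm0, c -> rsi, d -> rdx (hidden pointer),
//   e -> rcx (address), f -> r8d
int foo(int a, float b, void* c, std::string d, int& e, int f) {
    (void)a;  // consumed by the hook's bookkeeping
    // Dropping `a` shifts every integer-class argument down by one register
    // (rsi -> rdi, rcx -> rdx, r8d -> ecx), and a new std::string object is
    // constructed for foo_impl, with its address placed in rsi.
    return foo_impl(b, c, std::move(d), e, f);
}

// Small self-check exercising the sketch.
int run_example() {
    int e = 1;
    int r = foo(7, 2.0f, nullptr, "abc", e, 5);
    assert(e == 6);
    return r;
}
```

Only b keeps its register here; everything else has to move, which is the rearrangement cost discussed in the rest of this answer.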

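On ARM64, small struct and class arguments are passed in registers: an aggregate of up to four floating-point members of the same type has its fields placed in consecutive floating-point registers, and other aggregates of up to 16 bytes are packed into general-purpose registers, so such arguments use up part of the 8 + 8 register budget. Larger aggregates, and classes with non-trivial copy constructors or destructors, are placed in memory and passed by pointer.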

As such, if you are removing one argument from the start of the argument list and absolutely need to minimize the argument-passing overhead, the cheapest option is a combination of two things. First, have foo_impl take all class parameters by rvalue reference so that the copy constructor does not need to be invoked. Second, pass the unneeded parameter anyway: it is cheaper to pass it and let foo_impl discard it than to omit it and force foo to shuffle the remaining arguments into different registers.
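A minimal sketch of that combination, reusing the question's names. The body of foo_impl and the run_example helper are placeholders invented for illustration; only the shape of the signatures matters.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical implementation: same parameter order as foo, including the
// unused leading int, with the class parameter taken by rvalue reference.
int foo_impl(int /*a, ignored*/, float b, void* c, std::string&& d, int& e, int f) {
    e += f;  // write through the reference so the effect is observable
    return static_cast<int>(b) + static_cast<int>(d.size()) + (c ? 1 : 0);
}

// Signature from the question, unchanged.
int foo(int a, float b, void* c, std::string d, int& e, int f) {
    // possible bookkeeping here
    // Every argument is forwarded in its original position, and d is handed
    // over as an rvalue reference, so no std::string copy or move constructor
    // runs for this call.
    return foo_impl(a, b, c, std::move(d), e, f);
}

// Small self-check exercising the sketch.
int run_example() {
    int e = 1;
    int r = foo(7, 2.0f, nullptr, "abc", e, 5);
    assert(e == 6);
    return r;
}
```

Because the two parameter lists line up position by position, the arguments can plausibly stay in the registers they arrived in. Whether the compiler actually emits no moves depends on the target and optimization level, so it is worth confirming in the assembly.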

However, this is an extreme micro-optimisation that saves at most 8 movs in the worst case, so I would only do it after profiling has identified this call as a significant hotspot. If you are in that situation, I would also recommend reading the generated assembly to check that the calling convention behaves the way you expect.

Your idea of taking a pointer to b and reinterpreting it as FooArgs* is undefined behavior and would not work. Even if the compiler made it work, it would likely result in a slower call than the original mechanism, because the register arguments would need to be spilled to the stack and then read back in foo_impl. That involves a series of L1 cache accesses, which are more expensive than the register-to-register moves needed by the naive approach.
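Two language-level reasons why the parameters cannot be treated as a FooArgs can be checked directly. The sketch below uses the FooArgs struct from the question; the helper names ref_param_address_is_referent and run_example are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Struct from the question.
struct FooArgs {
    float b;
    void* c;
    std::string d;
    int& e;
    int f;
};

// A class with a reference member is not standard-layout, so the language
// gives few guarantees about how its members are arranged in memory.
static_assert(!std::is_standard_layout<FooArgs>::value,
              "FooArgs has a reference member, so it is not standard-layout");

// Taking the address of a reference parameter yields the address of the
// referenced object, not of any storage holding the parameter itself, so
// there is no way to even name a "slot" for e next to b, c and d.
bool ref_param_address_is_referent(int& e, const int* referent) {
    return &e == referent;
}

// Small self-check exercising the sketch.
bool run_example() {
    int x = 0;
    bool same = ref_param_address_is_referent(x, &x);
    assert(same);
    return same;
}
```

On top of this, each parameter is a separate object, so stepping from &b to the other parameters through a FooArgs* reads memory the language never promised is there.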

user1937198 answered Dec 23 '25 21:12