How can I force GCC to pass a 128-bit or 256-bit struct as a function parameter in an xmm/ymm register? For example, my struct is 256 bits wide (UnsignedLongLongStruct below).
(I know that if I use intrinsics to make a packed integer, GCC is smart enough to put it into a %ymm register, but can I do the same with a struct?)
typedef struct {
    unsigned long long ull1;
    unsigned long long ull2;
    unsigned long long ull3;
    unsigned long long ull4;
} UnsignedLongLongStruct;

void func1( UnsignedLongLongStruct unsignedLongLongStruct ) {
    ....
}
TL;DR: The calling convention seems to explicitly name __m256 and friends as the types that are placed in the ymm registers.
In the x86-64 System V ABI, section 3.2.3, you can check how parameters are passed. My reading is that only __m256
arguments are classified as one SSE and three SSEUP 8-byte chunks, which is what allows them to be passed in a ymm register.
Your struct is instead classified as four INTEGER eightbytes, and an aggregate larger than two eightbytes that is not SSE followed only by SSEUP gets passed in memory, which is what we see in clang, gcc, and icc: Test program on godbolt
To pass it in a register, as I read the calling convention, it seems that you have to pass it as a __m256 (or a variant of it).
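As a rough sketch of that workaround, the struct can be copied into a 256-bit vector type before the call. This assumes GCC or Clang, since it uses the vector_size extension (GCC's own __m256i is defined as a 32-byte vector of this kind). The names v4ull, struct_to_vec, vec_to_struct and func1_vec are made up for illustration, and summing the lanes is only a stand-in for the real body of func1. Compile with -mavx so the argument can actually land in a ymm register; without it the code should still build, but GCC warns (-Wpsabi) and the argument is not passed in ymm.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    unsigned long long ull1;
    unsigned long long ull2;
    unsigned long long ull3;
    unsigned long long ull4;
} UnsignedLongLongStruct;

/* Hypothetical name: a 32-byte vector of four 64-bit lanes
   (GCC/Clang vector extension, not standard C). */
typedef unsigned long long v4ull __attribute__((vector_size(32)));

static_assert(sizeof(v4ull) == sizeof(UnsignedLongLongStruct),
              "struct and vector must have the same size");

/* Copy the struct bytes into the vector; memcpy avoids aliasing problems. */
static v4ull struct_to_vec(UnsignedLongLongStruct s) {
    v4ull v;
    memcpy(&v, &s, sizeof v);
    return v;
}

/* Copy the vector bytes back into the struct. */
static UnsignedLongLongStruct vec_to_struct(v4ull v) {
    UnsignedLongLongStruct s;
    memcpy(&s, &v, sizeof s);
    return s;
}

/* Stand-in for func1: takes the vector by value. noinline keeps a real
   call in the generated code so the argument passing can be inspected. */
__attribute__((noinline))
unsigned long long func1_vec(v4ull v) {
    return v[0] + v[1] + v[2] + v[3];
}
```

A caller would write func1_vec(struct_to_vec(s)) and, inside the callee, use vec_to_struct(v) if it wants the named fields back. Whether the extra copies are worth it depends on the caller; checking the generated assembly (for example on godbolt) is the only reliable way to confirm where the argument ends up.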