Compile with g++.exe -m64 -std=c++17 and any optimization level, and run:
    #include <iostream>

    int main() {
        const auto L1 = [&](){};
        std::cout << sizeof(L1) << std::endl;
        const auto L2 = [&](){L1;};
        std::cout << sizeof(L2) << std::endl;
        const auto L3 = [&](){L1, L2;};
        std::cout << sizeof(L3) << std::endl;
        const auto L4 = [&](){L1, L2, L3;};
        std::cout << sizeof(L4) << std::endl;
    }
The output is 1, 8, 16, 24, which means that L2 contains 1 reference, L3 contains 2 and L4 contains 3.
However, given a lambda such as [&](){L1, L2;} in main(), the offset between the two captured locals (&L1 - &L2) should be fixed at compile time, so L1 could be reached through a pointer to L2 using x86 base-plus-displacement addressing, [rbx+const], assuming rbx=&L2. Why does GCC still choose to include every reference in the lambda?
I think this is a missed optimization, so you could report it as a gcc bug on https://gcc.gnu.org/bugzilla/. Use the missed-optimization keyword.
A capturing lambda isn't a function on its own, and can't decay/convert to a function pointer, so I don't think there's any required layout for the lambda object. (See also: Use a lambda as a parameter for a C++ function.) The code that reads the lambda object will always be generated from the same compilation unit that defined it. So it sounds plausible that it just needs one base pointer for all locals, with offsets from that.
Other captures of variables with storage class other than automatic might still need separate pointers, if their offsets from each other weren't compile-time or at least link-time constants. (Or that could be a separate optimization.)
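Until a compiler does this on its own, the single-base-pointer layout can be approximated by hand. This is a minimal sketch, not something from the question: Frame is a hypothetical struct that stands in for "all the locals of one stack frame", and the function names are made up for illustration. The sizes in the comments are what GCC and Clang produce in practice; the standard does not fix the layout of a closure type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical aggregate standing in for the locals of one stack frame.
struct Frame {
    int a = 1;
    int b = 2;
    int c = 3;
};

// Three locals captured by reference: in practice the closure stores
// one pointer per captured variable.
std::size_t size_with_separate_captures() {
    int a = 1, b = 2, c = 3;
    auto f = [&] { return a + b + c; };
    assert(f() == 6);
    return sizeof(f);
}

// One reference to the whole Frame: the closure stores a single pointer,
// and each member is reached at a fixed offset from it.
std::size_t size_with_single_base() {
    Frame frame;
    auto f = [&frame] { return frame.a + frame.b + frame.c; };
    assert(f() == 6);
    return sizeof(f);
}
```

With g++ -m64 one would expect size_with_separate_captures() to return 24 and size_with_single_base() to return 8, matching the one-pointer-per-reference pattern in the question's output.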
You can actually get the compiler to use the space and create a lambda object in memory by passing the lambda to a __attribute__((noinline)) template function. https://godbolt.org/z/Pt0SCC.
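A rough sketch of that experiment follows. It is not the code behind the Godbolt link; call_out_of_line and sum_via_closure are hypothetical names, and __attribute__((noinline)) is a GCC/Clang extension.

```cpp
// Keeping the call out of line means the compiler cannot dissolve the
// closure into registers: it has to build the object in memory and pass
// its address to the callee.
template <class F>
__attribute__((noinline)) int call_out_of_line(const F& f) {
    return f();
}

int sum_via_closure() {
    int a = 1, b = 2, c = 3;
    // Captures three locals by reference.
    const auto lam = [&] { return a + b + c; };
    return call_out_of_line(lam);
}
```

In the generated assembly for sum_via_closure, one would look for the stores that fill in the closure object just before the call; with the layout discussed above, that should be one pointer store per captured variable.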