In the paper Loop Recognition in C++/Java/Go/Scala (pdf) we find the following quote in the section C++ Tunings:
Structure Peeling. The structure UnionFindNode has 3 cold fields: type_, loop_, and header_. Since nodes are allocated in an array, this is a good candidate for peeling optimization. The three fields can be peeled out into a separate array. Note the header_ field is also dead – but removing it has very little performance impact. The name_ field in the BasicBlock structure is also dead, but it fits well in the padding space so it is not removed.
Can someone explain to me what cold/dead fields are, and what a peeling optimization is? (I understand what the author did there, but what is the rationale behind it?)
Structure peeling is an optimization where you split a structure into several smaller ones to improve data locality and so reduce cache misses. You separate the "hot" fields (frequently accessed) from the "cold" fields (seldom accessed) into two structures. When the hot structures are stored contiguously, each cache line holds more of the data the program actually uses, which raises the probability of a cache hit.
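In the article, the authors decided to move the type_, loop_ and header_ fields away from the more frequently accessed fields. A "dead" field goes one step further than a cold one: it is never read at all, so it could be deleted outright. The quote notes that header_ and name_ are dead, but removing them gains little (name_ only occupies space that would otherwise be padding).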
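To make the idea concrete, here is a minimal sketch of what peeling could look like in C++. It is not the paper's code: only the names type_, loop_ and header_ come from the quote, while UnpeeledNode, HotNode, ColdNode, PeeledNodes, find_root, the hot fields parent_ and dfs_number_, and all field types are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unpeeled layout: hot and cold fields share one struct, so every node
// visited in a hot loop also pulls its cold fields into the cache.
// parent_ and dfs_number_ (and all types) are hypothetical.
struct UnpeeledNode {
    int   parent_;      // hot (assumed): index of the parent node
    int   dfs_number_;  // hot (assumed)
    int   type_;        // cold
    void* loop_;        // cold
    void* header_;      // cold, and dead according to the quote
};

// Peeled layout: the same fields split into two structs.
struct HotNode  { int parent_; int dfs_number_; };
struct ColdNode { int type_; void* loop_; void* header_; };

// The hot struct is smaller, so more nodes fit in each cache line.
static_assert(sizeof(HotNode) < sizeof(UnpeeledNode),
              "peeling should shrink the hot struct");

// Two parallel arrays: element i of each array describes node i.
struct PeeledNodes {
    std::vector<HotNode>  hot;
    std::vector<ColdNode> cold;

    explicit PeeledNodes(std::size_t n) : hot(n), cold(n) {
        for (std::size_t i = 0; i < n; ++i) {
            hot[i].parent_ = static_cast<int>(i);  // each node starts as its own root
            hot[i].dfs_number_ = 0;
            cold[i] = ColdNode{0, nullptr, nullptr};
        }
    }
};

// Hot path: follows parent links to the root and reads only the hot array.
inline int find_root(const PeeledNodes& nodes, int i) {
    assert(i >= 0 && static_cast<std::size_t>(i) < nodes.hot.size());
    while (nodes.hot[i].parent_ != i) {
        i = nodes.hot[i].parent_;
    }
    return i;
}
```

The cold data is still available through nodes.cold[i] when it is needed, but a traversal such as find_root never touches it. Peeling into parallel arrays works here because, as the quote says, the nodes are allocated in an array, so one index identifies both halves of a node.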
For more information, you can have a look at this scientific article about structure layout optimization, which contains a description of structure peeling among other techniques: Structure Layout Optimizations in the Open64 Compiler: Design, Implementation and Measurements
If you have access to the ACM digital library, you can also download Practical structure layout optimization and Advice.