Assume I have a compilation unit consisting of three functions, A, B, and C. A is invoked once from a function external to the compilation unit (e.g. it's an entry point or callback); B is invoked many times by A (e.g. it's invoked in a tight loop); C is invoked once by each invocation of B (e.g. it's a library function).
The entire path through A (passing through B and C) is performance-critical, though the performance of A itself is non-critical (as most time is spent in B and C).
What is the minimal set of functions which one should annotate with __attribute__ ((hot))
to effect more aggressive optimization of this path? Assume we cannot use -fprofile-generate
.
Equivalently: Does __attribute__ ((hot))
mean "optimize the body of this function", "optimize calls to this function", "optimize all descendant calls this function makes", or some combination thereof?
The GCC info page does not clearly address these questions.
hot The hot attribute on a function is used to inform the compiler that the function is a hot spot of the compiled program. The function is optimized more aggressively and on many target it is placed into special subsection of the text section so all hot functions appears close together improving locality.
The __attribute__ directive is used to decorate a code declaration in C, C++ and Objective-C programming languages. This gives the declared code additional attributes that would help the compiler incorporate optimizations or elicit useful warnings to the consumer of that code.
The naked storage-class attribute is a Microsoft-specific extension to the C language. For functions declared with the naked storage-class attribute, the compiler generates code without prolog and epilog code. You can use this feature to write your own prolog/epilog code sequences using inline assembler code.
Official documentation:
hot
The hot attribute on a function is used to inform the compiler that the function is a hot spot of the compiled program. The function is optimized more aggressively and on many target it is placed into special subsection of the text section so all hot functions appears close together improving locality. When profile feedback is available, via -fprofile-use, hot functions are automatically detected and this attribute is ignored.The hot attribute on functions is not implemented in GCC versions earlier than 4.3.
The hot attribute on a label is used to inform the compiler that path following the label are more likely than paths that are not so annotated. This attribute is used in cases where __builtin_expect cannot be used, for instance with computed goto or asm goto.
The hot attribute on labels is not implemented in GCC versions earlier than 4.8.
2007:
__attribute__((hot))
Hint that the marked function is "hot" and should be optimized more aggresively and/or placed near other "hot" functions (for cache locality).
Gilad Ben-Yossef:
As their name suggests, these function attributes are used to hint the compiler that the corresponding functions are called often in your code (hot) or seldom called (cold).
The compiler can then order the code in branches, such as if statements, to favour branches that call these hot functions and disfavour functions cold functions, under the assumption that it is more likely that that the branch that will be taken will call a hot function and less likely to call a cold one.
In addition, the compiler can choose to group together functions marked as hot in a special section in the generated binary, on the premise that since data and instruction caches work based on locality, or the relative distance of related code and data, putting all the often called function together will result in better caching of their code for the entire application.
Good candidates for the hot attribute are core functions which are called very often in your code base. Good candidates for the cold attribute are internal error handling functions which are called only in case of errors.
So, according to these sources, __attribute__ ((hot))
means:
.hot
section (to group all hot code in one location)After source code analysis we can say that "hot" attribute is checked with (lookup_attribute ("hot", DECL_ATTRIBUTES (current_function_decl))
; and when it is true, the functions's node->frequency
is set to NODE_FREQUENCY_HOT
(predict.c, compute_function_frequency()).
If the function has frequency as NODE_FREQUENCY_HOT
,
If there is no profile information and no likely/unlikely
on branches, maybe_hot_frequency_p
will return true for the function (== "...frequency FREQ is considered to be hot."). This turns value of maybe_hot_bb_p
into true for all Basic Blocks (BB) in the function ("BB can be CPU intensive and should be optimized for maximal performance.") and maybe_hot_edge_p
true for all edges in function. In turn in non -Os
-modes these BB and edges and also loops will be optimized for speed, not for size.
For all outbound call edges from this function, cgraph_maybe_hot_edge_p
will return true ("Return true if the call can be hot."). This flag is used in IPA (ipa-inline.c, ipa-cp.c, ipa-inline-analysis.c) and influence inline and cloning decisions
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With