This is related to "Determine cause of segfault when using -O3?". In that question, I'm catching a segfault in a particular function when compiled with -O3 using a particular version of GCC. At -O3, vectorization instructions are used (at -O2, they are not used).
I want to wrap a single function in a lower optimization level. According to Switching off optimization for a specific function in GCC 4.2.2, I can do it. However, following the various links in the question and answers, I don't find an answer for "how, exactly, to do it".
How do I mark a single function to use a different optimization level?
Relatedly, I don't want to move this function to a separate file and then provide a different makefile recipe for it. Doing that opens another can of worms, like applying it to GCC 4.9 only on some platforms.
I know this question is tagged as GCC, but I was just looking into doing this portably and thought the results may come in handy for someone, so:
GCC: an optimize(X) function attribute.
clang: optnone and minsize function attributes (use __has_attribute to test for support). Since I believe 3.5 it also has #pragma clang optimize on|off.
ICC: #pragma intel optimization_level 0, which applies to the next function after the pragma.
MSVC: #pragma optimize, which applies to the first function after the pragma.
XL: #pragma option_override(funcname, "opt(level,X)"). Note that 13.1.6 (at least) returns true for __has_attribute(optnone) but doesn't actually support it.
ARM: #pragma Onum, which can be coupled with #pragma push/pop.
ODS: #pragma opt X (funcname).
Cray: #pragma _CRI [no]opt.
TI: #pragma FUNCTION_OPTIONS(func,"…") (C) and #pragma FUNCTION_OPTIONS("…") (C++).
IAR: #pragma optimize=...
Pelles: #pragma optimize time/size/none.
So, for GCC/ICC/MSVC/clang/IAR/Pelles and TI C++, you could define a macro that you just put before the function. If you want to support XL, ODS, and TI C you could add the function name as an argument. ARM would require another macro after the function to pop the setting. For Cray AFAIK you can't restore the previous value, only turn optimization off and on.
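As a rough illustration of the "macro before the function" idea, here is a minimal sketch covering only GCC and clang. The macro name FUNC_NO_OPT and the function sum_bytes are hypothetical, and disabling optimization entirely (O0 / optnone) is just the choice made for this example. Any compiler that is not recognized gets an empty macro.

```c
#include <stddef.h>

/* FUNC_NO_OPT is a hypothetical name, not something from a real header.
 * clang is tested first because it also defines __GNUC__. */
#if defined(__clang__)
#  if __has_attribute(optnone)
#    define FUNC_NO_OPT __attribute__((optnone))
#  endif
#elif defined(__GNUC__)
   /* Other compilers that define __GNUC__ may also land here. */
#  define FUNC_NO_OPT __attribute__((optimize("O0")))
#endif

#ifndef FUNC_NO_OPT
#  define FUNC_NO_OPT /* unrecognized compiler: expands to nothing */
#endif

/* Hypothetical function used only to show where the macro goes. */
FUNC_NO_OPT
unsigned sum_bytes(const unsigned char *buf, size_t count)
{
    unsigned total = 0;
    size_t i;
    for (i = 0; i < count; ++i)
        total += buf[i];
    return total;
}
```

The __has_attribute test is nested inside the clang branch so that a preprocessor without __has_attribute never has to parse it.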
I think the main reason for this is to disable optimizations for a buggy compiler (or a compiler which exposes bugs in your code), so a unified portable experience probably isn't critical, but hopefully this list helps someone find the right solution for their compiler.
Edit: It's also worth noting that it's relatively common to disable optimizations because code which was working before no longer does. While it's possible that there is a bug in the compiler, it's much more likely that your code was relying on undefined behavior, and newer, smarter compilers can and will elide the undefined case. The right answer in situations like this is not to disable optimizations, but instead to fix your code. UBsan on clang and gcc can help a lot here; compile with -fsanitize=undefined and lots of undefined behavior will start emitting warnings at runtime. Also, try compiling with all the warning options you can enable; for GCC that means -Wall -Wextra, for clang throw in -Weverything.
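To make the undefined-behavior point concrete, here is a small sketch of the kind of code that can "work" at a low optimization level and then change behavior at a higher one. Both function names and the file name in the comment are made up for illustration. The first function relies on signed overflow, which is undefined in C, so an optimizer is permitted to assume it never happens. The second expresses the same intent with well-defined code.

```c
#include <limits.h>

/* Relies on undefined behavior: when x == INT_MAX, x + 1 overflows.
 * An optimizing compiler may assume that never happens and treat the
 * comparison as always true. */
int can_increment_buggy(int x)
{
    return x + 1 > x;
}

/* Well-defined replacement: compares against the limit instead of
 * overflowing. */
int can_increment(int x)
{
    return x < INT_MAX;
}

/* One possible way to build while hunting for this kind of bug
 * (example.c is a placeholder file name):
 *
 *   gcc   -O2 -Wall -Wextra -fsanitize=undefined example.c
 *   clang -O2 -Wall -Wextra -fsanitize=undefined example.c
 */
```

Calling the buggy version with INT_MAX is the sort of case UBsan is meant to flag at runtime. Whether a given compiler version reports it can vary.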
It's described in https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
You can change the level by declaring the function like this:
void __attribute__ ((optimize(1))) some_func() {
....
}
This forces optimization level 1 for that function.
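A complete, compilable sketch of the attribute approach might look like the following. The macro OPT_LEVEL_1 and the function dot are hypothetical names for this example only. The guard assumes clang does not implement GCC's optimize attribute, so the macro is left empty there. GCC expects the attribute before the declarator when it appears on a function definition.

```c
#include <stddef.h>

/* OPT_LEVEL_1 is a hypothetical convenience macro.
 * Assumption: only real GCC implements the optimize attribute. */
#if defined(__GNUC__) && !defined(__clang__)
#  define OPT_LEVEL_1 __attribute__((optimize(1)))
#else
#  define OPT_LEVEL_1
#endif

/* Hypothetical function compiled at level 1 even if the rest of the
 * file is built with -O3. */
OPT_LEVEL_1
long dot(const int *a, const int *b, size_t n)
{
    long acc = 0;
    size_t i;
    for (i = 0; i < n; ++i)
        acc += (long)a[i] * b[i];
    return acc;
}
```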
Here is how to do it with pragmas:
#pragma GCC push_options
#pragma GCC optimize ("-O2")
void xorbuf(byte *buf, const byte *mask, size_t count)
{
...
}
#pragma GCC pop_options
To make it portable, use something like the following:
#define GCC_OPTIMIZE_AWARE (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)) || defined(__clang__)
#if GCC_OPTIMIZE_AWARE
# pragma GCC push_options
# pragma GCC optimize ("-O2")
#endif
It needs to be wrapped because with -Wall, older versions of GCC don't understand -Wno-unknown-pragmas, and they will cause a noisy compile. Older versions will be encountered in the field, like GCC 4.2.1 on OpenBSD.
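Putting the guarded push and the matching pop together around one function could look like this sketch. The guard name HAVE_GCC_OPTIMIZE_PRAGMA and the function xor_bytes are hypothetical. Unlike the macro above, this guard excludes clang, on the assumption that clang does not act on the GCC optimize pragmas. The 4.7 cutoff is simply carried over from the snippet above.

```c
#include <stddef.h>

/* Hypothetical guard: 1 only for real GCC 4.7 or later. */
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
#  define HAVE_GCC_OPTIMIZE_PRAGMA 1
#else
#  define HAVE_GCC_OPTIMIZE_PRAGMA 0
#endif

#if HAVE_GCC_OPTIMIZE_PRAGMA
#  pragma GCC push_options      /* save the current option set */
#  pragma GCC optimize ("-O2")  /* lower level for what follows */
#endif

/* Hypothetical function standing in for the one that misbehaves at -O3. */
void xor_bytes(unsigned char *buf, const unsigned char *mask, size_t count)
{
    size_t i;
    for (i = 0; i < count; ++i)
        buf[i] ^= mask[i];
}

#if HAVE_GCC_OPTIMIZE_PRAGMA
#  pragma GCC pop_options       /* restore the saved option set */
#endif
```

Keeping the pop under the same guard as the push means a compiler that skips one also skips the other.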
But according to Markus Trippelsdorf in "When did 'pragma optimize' become available?" on the GCC mailing list:
This is a bad idea in general, because "pragma GCC optimize" is meant as a compiler debugging aid only. It should not be used in production code.