When I tested the code below with the older gcc-4.4.0 and gcc-4.6.4, the compiler was able to apply a smart optimization and pre-calculate the result for const inputs:
#include <iostream>
#include <chrono>
using namespace std;
const auto N = 1000000000ULL; // constexpr was also tested, no effect
// Sums the integers 0 .. n-1 with a plain loop.
unsigned long long s(unsigned long long n)
{
    auto sum = 0ULL;
    for (auto i = 0ULL; i < n; i++)
        sum += i;
    return sum;
}
int main()
{
    auto t1 = std::chrono::high_resolution_clock::now();
    auto x = s(N);
    auto t2 = std::chrono::high_resolution_clock::now();
    auto t = std::chrono::duration_cast<std::chrono::nanoseconds>(t2-t1).count();
    // t is in nanoseconds; dividing by 1e6 converts it to milliseconds.
    cout << "Result: " << x << " -- time (ms):" << t/1e6 << endl;
}
Since N is a constant value, the compiler can run function s at compile time and assign the result to x. (No run-time calculation is needed for N.)
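Results in different versions of gcc (and also a version of clang):
0.001532 ms
0.013517 ms
0.001 ms
1313.78 ms (!!)
Question:
Is this optimization omitted in 4.8.1? Why? If it's omitted, how can I force the compiler to do this pre-calculation?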
Note(1): I tested both the -O2 and -O3 switches; neither had any effect.
Note(2): By forcing, I mean compiler commands and switches.
Note(3): Function s is just an example; it can be replaced by more complicated functions.
Status of Experimental C++11 Support in GCC 4.8: GCC provides experimental support for the 2011 ISO C++ standard. This support can be enabled with the -std=c++11 or -std=gnu++11 compiler options; the former disables GNU extensions.
To see if your compiler has C++11 support, run it with just the --version option to get a print out of the version number. Do this for whichever compiler(s) you wish to use with Rosetta. Acceptable versions: GCC/g++: Version 4.8 or later.
According to cppreference, full support of C++11 came with GCC 4.8.
I've submitted it as a bug. Yes, it's a regression in version 4.8, and it was fixed in newer revisions 5 weeks ago. Follow it here:
You can FORCE it to run at compile time using the new C++11 constexpr keyword.
First you must transform iteration into recursion (this requirement is removed in C++1y), for example:
// Sums 0 .. n-1, matching the loop in the question.
constexpr unsigned long long s(unsigned long long n)
{
    return n ? (n - 1) + s(n - 1) : 0;
}
Or with tail recursion (still works well for run-time computation when the input is variable):
// accum holds the sum of 0 .. n-1 so far; recursion stops when n reaches n_max.
constexpr unsigned long long s_impl(unsigned long long accum, unsigned long long n, unsigned long long n_max)
{
    return (n < n_max) ? s_impl(accum + n, n + 1, n_max) : accum;
}
constexpr unsigned long long s(unsigned long long n)
{
    return s_impl(0, 0, n);
}
(In C++1y, all you'd need to do is add the constexpr keyword to the existing implementation.)
Then invoke it with
constexpr auto x = s(N);
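As a minimal sketch of the C++1y (C++14) route mentioned above, assuming a compiler invoked with -std=c++14 or later, the loop from the question can stay as it is once the function is marked constexpr. The names s_loop, kSmallN and x_small are illustrative only, and the small bound is an assumption: a bound as large as 1000000000 may run into the compiler's limits on constant evaluation.

```cpp
// C++14 or later: loops and local variables are allowed in constexpr functions.
constexpr unsigned long long s_loop(unsigned long long n)
{
    unsigned long long sum = 0;
    for (unsigned long long i = 0; i < n; i++)
        sum += i;
    return sum;
}

// Illustrative small input (an assumption, not the N from the question).
constexpr unsigned long long kSmallN = 1000ULL;

// Initializing a constexpr variable requires compile-time evaluation.
constexpr auto x_small = s_loop(kSmallN);
static_assert(x_small == 499500ULL, "0 + 1 + ... + 999");
```

The same function can still be called with a run-time argument, in which case it behaves like the original loop.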
The C++11 way to deal with computations at compile time is the use of constexpr. Sadly, constexpr functions are somewhat limited in what can be done. In C++11, a constexpr function is allowed to contain empty statements, static_assert() declarations, typedefs, and using declarations/directives, and exactly one return statement (I got temporarily confused because I was looking at the C++14 draft, which has the rules relaxed). That is, you'd need to formulate your function recursively. On the plus side, if a constexpr function is called with a constant expression, it will be evaluated at compile time.
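A constexpr function may still be evaluated at run time when its result is not needed in a constant expression. One way to make the requirement explicit, sketched here under the assumption that only C++11 is available, is to pass the call as a template argument. The names sum_below, compile_time, kDepthSafeN and folded are hypothetical and not part of the original code.

```cpp
#include <type_traits>

// C++11-style constexpr: a single return statement, written recursively.
// Sums 0 .. n-1, like the loop in the question.
constexpr unsigned long long sum_below(unsigned long long n)
{
    return n ? (n - 1) + sum_below(n - 1) : 0;
}

// A template argument must be a constant expression, so a call used
// as V cannot be deferred to run time.
template <unsigned long long V>
using compile_time = std::integral_constant<unsigned long long, V>;

// Small input chosen to stay well inside typical recursion-depth limits.
constexpr unsigned long long kDepthSafeN = 100ULL;
constexpr unsigned long long folded = compile_time<sum_below(kDepthSafeN)>::value;
static_assert(folded == 4950ULL, "0 + 1 + ... + 99");
```

GCC and Clang both accept -fconstexpr-depth=<n> to raise the recursion limit, but a depth on the order of 1000000000 is unlikely to be practical with the recursive form.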
Other than that, you might want to help out the compiler with its constant folding. For example, it could help to make s() an inline function and to declare N as constexpr unsigned long long N = 1000000000ULL;
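The two hints above might look like this in practice. This is only a sketch: whether a particular GCC release then folds the loop is not guaranteed, and the names kN and s_inline, as well as the small value of kN, are assumptions used to keep the example quick.

```cpp
// Declaring the bound constexpr guarantees it is a compile-time constant.
constexpr unsigned long long kN = 1000ULL;

// inline does not force anything; the optimizer still decides whether
// to inline the call and fold the loop into a constant.
inline unsigned long long s_inline(unsigned long long n)
{
    unsigned long long sum = 0;
    for (unsigned long long i = 0; i < n; i++)
        sum += i;
    return sum;
}
```

Calling s_inline(kN) gives the optimizer a constant argument to work with, but unlike a constexpr initialization it does not require the result to be computed at compile time.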
Is this optimization omitted in 4.8.1?
It looks like it is gone. It is still present in 4.7.2 though.
Why? [From one of your comments:] I think that optimization was excellent and doesn't hurt anything.
It is most likely accidental and the gcc developers don't know about it.
I can think of a good reason why I would want to at least put an upper bound on this optimization. I got bitten by MSVC back in 2009: when I gave it machine-generated C code, it tried to optimize it and the compiler struggled with it for minutes. Obviously, it was desperately attempting some optimization that should have been limited in some way, so that the compiler wouldn't struggle for minutes over a 7KB source file. My point is: you may want to limit optimizations that can potentially increase your compile times too much.
However, that doesn't seem to be the case here. I have tried it with fairly small values of N, and the optimization is not performed for them either.
If it's omitted, how can I force the compiler to do this pre-calculation?
Note(2): Forcing, I mean compiler's commands and switches
I couldn't trick gcc 4.8.1 into doing this optimization. I will submit a bug report unless somebody says soon that it is a known issue or that it can be enabled with some compiler flag.