c++11 fast constexpr integer powers


Tags: c++, optimization, c++11, recursion, constexpr

Beating the dead horse here. A typical (and fast) way of doing integer powers in C is this classic:

int64_t ipow(int64_t base, int exp){
  int64_t result = 1;
  while(exp){
    if(exp & 1)
      result *= base;
    exp >>= 1;
    base *= base;
  }
  return result;
}

However I needed a compile time integer power so I went ahead and made a recursive implementation using constexpr:

constexpr int64_t ipow_(int base, int exp){
  return exp > 1 ? ipow_(base, (exp>>1) + (exp&1)) * ipow_(base, exp>>1) : base;
}
constexpr int64_t ipow(int base, int exp){
  return exp < 1 ? 1 : ipow_(base, exp);
}

The second function is only to handle exponents less than 1 in a predictable way. Passing exp<0 is an error in this case.
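As a quick check that the pair above really is usable at compile time, here is a minimal sketch that repeats the two functions unchanged and pins a few values with static_assert. The names kTableSize and table are hypothetical, added only to show a context that requires a constant expression.

```cpp
#include <cstdint>

// The question's helper, unchanged: valid for exp >= 1.
constexpr std::int64_t ipow_(int base, int exp) {
  return exp > 1 ? ipow_(base, (exp >> 1) + (exp & 1)) * ipow_(base, exp >> 1)
                 : base;
}

// The question's wrapper, unchanged: any exp < 1 yields 1.
constexpr std::int64_t ipow(int base, int exp) {
  return exp < 1 ? 1 : ipow_(base, exp);
}

// These are checked by the compiler, not at run time.
static_assert(ipow(2, 10) == 1024, "2^10");
static_assert(ipow(3, 1) == 3, "3^1");
static_assert(ipow(0, 0) == 1, "0^0 is treated as 1");
static_assert(ipow(0, 5) == 0, "0^5");
static_assert(ipow(-2, 3) == -8, "negative base");

// Hypothetical use: an array bound must be a constant expression.
constexpr std::int64_t kTableSize = ipow(2, 4);
int table[kTableSize];
static_assert(sizeof(table) / sizeof(table[0]) == 16, "array has 2^4 elements");
```

If any of the static_assert lines were false, the file would fail to compile, which is the whole point of asking for a constexpr version.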

The recursive version is 4 times slower

I generate a vector of 10E6 random-valued bases and exponents in the range [0,15] and time both algorithms on the vector (after doing a non-timed run to try to remove any caching effects). Without optimization the recursive method is twice as fast as the loop. But with -O3 (GCC) the loop is 4 times faster than the recursive method.
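One plausible contributor to the gap (an assumption on my part, not something measured in the post) is the amount of arithmetic each version does. The loop does one squaring per iteration plus one multiplication per set bit, while the two-branch recursion calls itself twice at every level. The sketch below only counts multiplications. The helper names loop_mults and recursive_mults are hypothetical.

```cpp
// Multiplications done by the while loop: each iteration squares the base
// once and multiplies into the result when the low bit of exp is set.
constexpr int loop_mults(int exp) {
  return exp > 0 ? 1 + (exp & 1) + loop_mults(exp >> 1) : 0;
}

// Multiplications done by the two-branch recursion: one to combine the two
// halves, plus whatever each half needs. exp <= 1 needs none.
constexpr int recursive_mults(int exp) {
  return exp > 1 ? 1 + recursive_mults((exp >> 1) + (exp & 1)) +
                       recursive_mults(exp >> 1)
                 : 0;
}

static_assert(loop_mults(15) == 8, "4 squarings + 4 set bits");
static_assert(recursive_mults(15) == 14, "exp - 1 multiplications");
static_assert(loop_mults(8) == 5, "4 squarings + 1 set bit");
static_assert(recursive_mults(8) == 7, "exp - 1 multiplications");
```

So the recursion is linear in exp while the loop is logarithmic. Call overhead and what the optimizer manages to inline will also matter, so this count alone should not be read as a full explanation of the 4x figure.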

My question to you guys is this: Can anyone come up with a faster ipow() function that handles exponents and bases of 0 and can be used as a constexpr?

(Disclaimer: I don't need a faster ipow, I'm just interested to see what the smart people here can come up with).


asked Jul 18 '13 09:07

Emily L.

2 Answers

A good optimizing compiler will transform tail-recursive functions to run as fast as imperative code. You can transform this function to be tail recursive by pumping the running result through an extra accumulator parameter. GCC 4.8.1 compiles this test program:

#include <cstdint>

constexpr int64_t ipow(int64_t base, int exp, int64_t result = 1) {
  return exp < 1 ? result : ipow(base*base, exp/2, (exp % 2) ? result*base : result);
}

int64_t foo(int64_t base, int exp) {
  return ipow(base, exp);
}

into a loop (See this at gcc.godbolt.org):

foo(long, int):
    testl   %esi, %esi
    movl    $1, %eax
    jle .L4
.L3:
    movq    %rax, %rdx
    imulq   %rdi, %rdx
    testb   $1, %sil
    cmovne  %rdx, %rax
    imulq   %rdi, %rdi
    sarl    %esi
    jne .L3
    rep; ret
.L4:
    rep; ret

vs. your while loop implementation:

ipow(long, int):
    testl   %esi, %esi
    movl    $1, %eax
    je  .L4
.L3:
    movq    %rax, %rdx
    imulq   %rdi, %rdx
    testb   $1, %sil
    cmovne  %rdx, %rax
    imulq   %rdi, %rdi
    sarl    %esi
    jne .L3
    rep; ret
.L4:
    rep; ret

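Apart from the entry check (jle versus je, because the recursive version tests exp < 1 while the loop tests exp != 0), the two are instruction-by-instruction identical, which is good enough for me.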
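To make the accumulator idea concrete, here is a small sketch that repeats the tail-recursive function from this answer and traces one call in comments. The static_assert lines are my addition. They confirm that the single function also covers the exp == 0, base == 0 and negative-exponent cases the question asks about.

```cpp
#include <cstdint>

// Tail-recursive form from the answer: 'result' carries the product so far,
// so nothing is left to do after the recursive call returns.
constexpr std::int64_t ipow(std::int64_t base, int exp,
                            std::int64_t result = 1) {
  return exp < 1 ? result
                 : ipow(base * base, exp / 2,
                        (exp % 2) ? result * base : result);
}

// Trace of ipow(3, 5):
//   ipow(3,    5, 1)    exp odd  -> result = 1 * 3
//   ipow(9,    2, 3)    exp even -> result unchanged
//   ipow(81,   1, 3)    exp odd  -> result = 3 * 81
//   ipow(6561, 0, 243)  exp < 1  -> return 243
static_assert(ipow(3, 5) == 243, "matches the trace above");
static_assert(ipow(2, 10) == 1024, "2^10");
static_assert(ipow(0, 0) == 1, "0^0 is treated as 1");
static_assert(ipow(0, 5) == 0, "0^5");
static_assert(ipow(7, -3) == 1, "negative exponents fall through to 1");
```

Note that the third parameter is part of the public signature here, so a caller could pass a starting value by accident. Hiding it behind a two-argument wrapper would be one way to avoid that.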


answered Oct 04 '22 05:10

Casey

It seems that this is a standard problem with constexpr and template programming in C++. Due to the constraints of compile-time evaluation, the constexpr version is slower than a normal version when executed at runtime, and overloading doesn't allow you to choose the correct version for each context. The standardization committee is working on this issue. See for example the following working document: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3583.pdf
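For readers not tied to the c++11 tag: C++14 relaxed the rules for constexpr functions so that local variables and loops are allowed. Under that assumption (a C++14 or later compiler), the question's loop can be marked constexpr almost as is. The name ipow14 is hypothetical, and the loop condition is exp > 0 rather than the question's while(exp), so that negative exponents return 1 instead of looping.

```cpp
#include <cstdint>

// Assumes C++14 or later: loops and mutable locals are permitted in
// constexpr functions, so one definition serves compile time and run time.
constexpr std::int64_t ipow14(std::int64_t base, int exp) {
  std::int64_t result = 1;
  while (exp > 0) {          // exp <= 0 skips the loop and returns 1
    if (exp & 1)
      result *= base;        // fold in the current power when the bit is set
    exp >>= 1;
    base *= base;            // square for the next bit
  }
  return result;
}

static_assert(ipow14(2, 10) == 1024, "2^10");
static_assert(ipow14(3, 4) == 81, "3^4");
static_assert(ipow14(0, 0) == 1, "0^0 is treated as 1");
static_assert(ipow14(7, -2) == 1, "negative exponents return 1");
```

C++20 later added std::is_constant_evaluated() in <type_traits>, which lets a single function take different paths at compile time and at run time. That addresses the selection problem described in this answer by a different route than the linked paper.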

answered Oct 04 '22 06:10

hivert
