I wrote two functions which do the same thing but use different algorithms. I compare their execution times using clock() from time.h, but the results are inconsistent.
I have tried changing the order in which the functions are executed, but whichever function runs first always seems to have the longer running time.
#include <stdio.h>
#include <time.h>
long exponent(int a, int b);
long exponentFast(int a, int b);
int main(void)
{
    int base;
    int power;
    clock_t begin;
    clock_t end;
    double time_spent;

    base = 2;
    power = 25;

    begin = clock();
    long result1 = exponentFast(base, power);
    end = clock();
    time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("Result1: %li, Time: %.9f\n", result1, time_spent);

    begin = clock();
    long result2 = exponent(base, power);
    end = clock();
    time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("Result2: %li, Time: %.9f\n", result2, time_spent);
}
long exponent(int a, int b)
{
    ...
}

long exponentFast(int a, int b)
{
    ...
}
I expect time_spent for result1 to have a lower value than that for result2, but the output is
Result1: 33554432, Time: 0.000002000
Result2: 33554432, Time: 0.000001000
Executing exponent() before exponentFast() also yields the same pattern, suggesting that my benchmarking method is wrong.
It can be surprisingly difficult to perform timings on function calls like these accurately and significantly. Here's a modification of your program that illustrates the difficulties:
#include <stdio.h>
#include <time.h>
#include <math.h>
long exponent(int a, int b);
long exponentFast(int a, int b);
void tester(long (*)(int, int));
#define NTRIALS 1000000000
int main(void)
{
    clock_t begin;
    clock_t end;
    double time_spent;

    begin = clock();
    tester(exponentFast);
    end = clock();
    time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("exponentFast: Time: %.9f = %.10f/call\n", time_spent, time_spent / NTRIALS);

    begin = clock();
    tester(exponent);
    end = clock();
    time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("exponent: Time: %.9f = %.10f/call\n", time_spent, time_spent / NTRIALS);
}
void tester(long (*func)(int, int))
{
    int base = 2;
    int power = 25;
    int i;
    unsigned long accum = 0;

    for (i = 0; i < NTRIALS; i++) {
        accum += (*func)(base, power);
        base = (base + 1) % 5;
        power = (power + 7) % 16;
    }

    printf("(accum = %lu)\n", accum);
}

long exponent(int a, int b)
{
    return pow(a, b);
}

long exponentFast(int a, int b)
{
    long ret = 1;
    int i;

    for (i = 0; i < b; i++)
        ret *= a;

    return ret;
}
You will notice that:

- The function under test is called a great many times (NTRIALS, here one billion), so that the total elapsed time is long enough to measure meaningfully. A helper function tester() is used to do this. (tester() therefore takes a pointer to the function being tested, which is a technique you may not have been familiar with yet.)
- The arguments passed to the function vary from call to call.
- The return values are accumulated and printed, so that the calls can't simply be skipped.

(The second and third bullets follow suggestions by Jonathan Leffler, and are intended to ensure that a too-clever compiler doesn't optimize out some or all of the interesting work.)
When I ran this on my computer (an ordinary consumer-grade laptop), these were the results I got:
(accum = 18165558496053920)
exponentFast: Time: 20.954286000 = 0.0000000210/call
(accum = 18165558496053920)
exponent: Time: 23.409001000 = 0.0000000234/call
There are two huge things to notice here: (1) the per-call times are almost unimaginably small, on the order of tens of nanoseconds, and (2) it took a billion calls, and some twenty seconds of elapsed time, to measure them.
(Actually, however, this first trial was significantly misleading. More on that in a bit.)
Before continuing, it's worth asking a couple of questions: (1) are these per-call figures even plausible, and (2) was such a huge number of trials really necessary?
To the first point, let's do a quick back-of-the-envelope analysis. My machine claims to have a 2.2 GHz CPU. That means (roughly speaking) that it can do 2.2 billion things per second, or about 0.45 nanoseconds per thing. So a function that's taking 21 nanoseconds can be doing roughly 21 ÷ 0.45 = 46 things. And since my sample exponentFast
function is doing roughly as many multiplications as the value of the exponent, it looks like we're probably in the right ballpark.
The other thing I did to confirm that I was getting at least quasi-reasonable results was to vary the number of trials. With NTRIALS
reduced to 100000000, the overall program took just about exactly a tenth of the time to run, meaning that the time per call was consistent.
Now, to point 2. I still remember one of my formative experiences as a programmer: I wrote a new and improved version of a standard function which I just knew was going to be gobs faster. After several hours spent debugging it just to get it to work at all, I discovered that it wasn't measurably faster, until I increased the number of trials up into the millions, and the penny (as they say) dropped.
But as I said, the results I've presented so far were, by a funny coincidence, misleading. When I first threw together some simple code to vary the arguments given to the function calls, as shown above, I had:
int base = 2;
int power = 25;
and then, within the loop
base = (base + 1) % 5;
power = (power + 7) % 16;
This was intended to allow base to run from 0 to 4, and power from 0 to 15, with the numbers chosen to ensure that the result wouldn't overflow even when base was 4. But this means that power was, on average, only 8, meaning that my simpleminded exponentFast call was only having to make 8 trips through its loop, on average, not 25 as in your original post.
When I changed the iteration step to
power = 25 + (power - 25 + 1) % 5;
-- that is, not varying base (and therefore allowing it to remain as the constant 2) and allowing power to vary between 25 and 29, now the time per call for exponentFast rose to about 63 nanoseconds. The good news is that this makes sense (roughly three times as many iterations, on average, made it about three times slower), but the bad news is that it looks like my "exponentFast" function isn't very fast! (Obviously, though, I didn't expect it to be, with its simpleminded, brute-force loop. If I wanted to make it faster, the first thing I'd do, at little additional cost in complexity, would be to apply "binary exponentiation".)
There's at least one more thing to worry about, though, which is that if we call these functions one billion times, we're not only counting one billion times the amount of time each function takes to do its work, but also one billion times the function call overhead. If the function call overhead is on a par with the amount of work the function is doing, we will (a) have a hard time measuring the actual work time, but also (b) have a hard time speeding things up! (We could get rid of the function call overhead by inlining the functions for our test, but that obviously wouldn't be meaningful if the actual use of the functions in the end program were going to involve real function calls.)
And then yet one more inaccuracy is that we're introducing a timing artifact by computing new and different values of base and/or power for each call, and adding up all the results, so that the amortized time to do that work goes into what we've been calling "time per call". (This problem, at least, since it affects either exponentiation function equally, won't detract from our ability to assess which one, if either, is faster.)
Addendum: Since my initial version of exponentFast really was pretty embarrassingly simpleminded, and since binary exponentiation is so easy and elegant, I performed one more test, rewriting exponentFast as
long exponentFast(int a, int b)
{
    long ret = 1;
    long fac = a;

    while (1) {
        if (b & 1)
            ret *= fac;
        b >>= 1;
        if (b == 0)
            break;
        fac *= fac;
    }

    return ret;
}
Now -- Hooray! -- the average call to exponentFast goes down to about 16 ns per call on my machine. But the "Hooray!" is qualified. It's evidently about 25% faster than calling pow(), and that's nicely significant, but not an order of magnitude or anything. If the program where I'm using this is spending all its time exponentiating, I'll have made that program 25% faster, too, but if not, the improvement will be less. And there are cases where the improvement (the time saved over all anticipated runs of the program) will be less than the time I spent writing and testing my own version. And I haven't yet spent any time doing proper regression tests on my improved exponentFast function, but if this were anything other than a Stack Overflow post, I'd have to. It's got several sets of edge cases, and might well contain lurking bugs.