I'm playing through Project Euler in my spare time, and it's come to the point where I need to do some refactoring. I've implemented Miller-Rabin, as well as a few sieves. I've heard that sieves are actually faster for small-ish numbers, as in under a few million. Does anybody have any information on this? Google wasn't very helpful.
Yes, you'll find that with most algorithms you can trade space for time. In other words, by allowing the use of more memory, speed can be greatly increased *a.
I don't actually know the Miller-Rabin algorithm but, unless it's simpler than a single shift-left/add and a memory fetch, it will be blown out of the water by a pre-calculated sieve.
The important thing here is pre-calculated. It's a good idea, in terms of performance, to pre-calculate things like this, since the first million primes are unlikely to change in the near future :-)
In other words, create your sieve with something like:
unsigned char primeTbl[] = {0,0,1,1,0,1,0,1,0,0,0,1};
#define isPrime(x) ((x) < sizeof(primeTbl) ? primeTbl[x] : isPrimeFn(x))
with all the usual caveats about not passing things like a++ into macros (the argument is evaluated more than once in the expansion, so isPrime(a++) would increment a twice). This gives you the best of both worlds: a blindingly fast table lookup for "small-ish" primes, dropping back to a calculation method for those outside the range.
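For the fallback, isPrimeFn can be any exact test (your Miller-Rabin, for instance). As a minimal sketch, just to make the macro above self-contained, plain trial division works (isPrimeFn here is only the placeholder name from the macro, not a standard function):

#include <stddef.h>

/* One possible fallback for values beyond the table: trial division
   by odd candidates up to the square root. Exact, just slower than
   a table lookup. */
static int isPrimeFn (size_t n) {
    size_t d;
    if (n < 2)
        return 0;
    if (n % 2 == 0)
        return n == 2;
    for (d = 3; d * d <= n; d += 2)
        if (n % d == 0)
            return 0;
    return 1;
}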
Obviously you would write a program using one of the other methods to generate that lookup table - you don't really want to have to type it all in by hand.
But, as with all optimisation questions, measure, don't guess!
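A rough harness for that measurement might look like this (illustrative only: it assumes the primeTbl, isPrime, and isPrimeFn definitions above, and serious benchmarking would want a better clock and some care that the optimiser doesn't elide the loop):

#include <stdio.h>
#include <time.h>

/* A volatile sink stops the compiler optimising the loop away. */
static volatile unsigned long sink;

static void benchmark (void) {
    clock_t start = clock ();
    unsigned long i, count = 0;
    for (i = 2; i < 1000000; i++)
        count += isPrime (i);
    sink = count;
    printf ("%lu primes found in %.2f seconds\n", count,
            (double) (clock () - start) / CLOCKS_PER_SEC);
}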
*a A classic case of this was some trig functions I once had to write for an embedded system. This was a competitive contract bid and the system had a little more storage than CPU grunt.
We actually won the contract since our benchmark figures for the functions blew the competition away.
Why? Because we pre-calculated the values into a lookup table originally calculated on another machine. By judicious use of reduction (bringing the input values down below 90 degrees) and trig properties (the fact that cosine is just a phase shift of sine and that the other three quadrants are related to the first), we got the lookup table down to 180 entries (one per half degree).
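That code is long gone, but the shape of it was something like the following (a sketch only: fastSin, fastCos, and the half-degree integer angles are illustrative, and the table here is built at startup purely to keep the example self-contained, where the real thing had the constants pasted in):

#include <math.h>

/* Quarter-wave table, one entry per half degree from 0 to 90 degrees.
   The extra 181st entry for exactly 90 degrees keeps the indexing
   simple. */
static double sinTbl[181];

static void buildSinTbl (void) {
    int i;
    for (i = 0; i <= 180; i++)
        sinTbl[i] = sin (i * 0.5 * 3.14159265358979323846 / 180.0);
}

/* Angles are integers in half-degree units, so 720 is a full circle.
   Reduce into the first quadrant, then fix the sign: the other three
   quadrants are just reflections of the first. */
static double fastSin (int halfDeg) {
    int a = halfDeg % 720;
    if (a < 0)
        a += 720;
    if (a <= 180) return  sinTbl[a];         /*   0.. 90 degrees */
    if (a <= 360) return  sinTbl[360 - a];   /*  90..180 degrees */
    if (a <= 540) return -sinTbl[a - 360];   /* 180..270 degrees */
    return -sinTbl[720 - a];                 /* 270..360 degrees */
}

/* Cosine is just sine phase-shifted by a quarter turn. */
static double fastCos (int halfDeg) {
    return fastSin (halfDeg + 180);
}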
The best solutions are those that are elegant and devious :-)
For what it's worth, the following C code will generate such a table for you, covering all the primes below four million (about 283,000 of them).
#include <stdio.h>

static unsigned char primeTbl[4000000];

int main (void) {
    size_t i, j;

    /* Start by assuming everything is prime, then knock out 0 and 1. */
    for (i = 0; i < sizeof(primeTbl); i++)
        primeTbl[i] = 1;
    primeTbl[0] = 0;
    primeTbl[1] = 0;

    /* Sieve of Eratosthenes: for each prime found, mark all of its
       multiples as composite. */
    for (i = 2; i < sizeof(primeTbl); i++)
        if (primeTbl[i])
            for (j = i + i; j < sizeof(primeTbl); j += i)
                primeTbl[j] = 0;

    /* Emit the table as compilable C, fifty entries per line. */
    printf ("static unsigned char primeTbl[] = {");
    for (i = 0; i < sizeof(primeTbl); i++) {
        if ((i % 50) == 0)
            printf ("\n    ");
        printf ("%d,", primeTbl[i]);
    }
    printf ("\n};\n");
    printf ("#define isPrime(x) "
            "((x) < sizeof(primeTbl) ? primeTbl[x] : isPrimeFn(x))\n");

    return 0;
}
If you can bump up the primeTbl table to sixteen million entries (16M), you'll find that's enough to keep the prime count above a million (it covers the first 1,031,130 primes).
Now there are ways to make that take less storage, such as storing only odd numbers (and adjusting the macro to take care of that) or using a bit mask instead of unsigned characters, as in the sketch below. I prefer the simplicity of the algorithm myself, if the memory is available.
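As an illustration of the bit-mask variant (a sketch; setPrime and isPrimeBit are hypothetical names, not part of the code above), packing eight flags per byte cuts the four-million-entry table from 4 MB to about 500 KB:

#include <stddef.h>

/* Bit-packed table: one bit per number instead of one byte.
   Storing only odd numbers would halve this again. */
#define LIMIT 4000000
static unsigned char primeBits[LIMIT / 8 + 1];

/* Mark n as prime (1) or composite (0). */
static void setPrime (size_t n, int isP) {
    if (isP)
        primeBits[n / 8] |= (unsigned char) (1u << (n % 8));
    else
        primeBits[n / 8] &= (unsigned char) ~(1u << (n % 8));
}

/* Test the flag for n. */
static int isPrimeBit (size_t n) {
    return (primeBits[n / 8] >> (n % 8)) & 1;
}

The sieve loop itself is unchanged; only the reads and writes go through these helpers.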