Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Improving a prime sieve algorithm

I'm trying to make a decent Java program that generates the primes from 1 to N (mainly for Project Euler problems).

At the moment, my algorithm is as follows:

Initialise an array of booleans (or a bitarray if N is sufficiently large) so they're all false, and an array of ints to store the primes found.

Set an integer, s equal to the lowest prime, (ie 2)

While s is <= sqrt(N)

Set all multiples of s (starting at s^2) to true in the array/bitarray.

Find the next smallest index in the array/bitarray which is false, use that as the new value of s.

Endwhile.

Go through the array/bitarray, and for every value that is false, put the corresponding index in the primes array.

Now, I've tried skipping over numbers not of the form 6k + 1 or 6k + 5, but that only gives me a ~2x speed up, whilst I've seen programs run orders of magnitudes faster than mine (albeit with very convoluted code), such as the one here

What can I do to improve?

Edit: Okay, here's my actual code (for N of 1E7):

int l = 10000000, n = 2, sqrt = (int) Math.sqrt(l);
boolean[] nums = new boolean[l + 1];
int[] primes = new int[664579];

while(n <= sqrt){
    for(int i = 2 * n; i <= l; nums[i] = true, i += n);
    for(n++; nums[n]; n++);
}

for(int i = 2, k = 0; i < nums.length; i++) if(!nums[i]) primes[k++] = i;

Runs in about 350ms on my 2.0GHz machine.

like image 773
Rob Avatar asked Jun 22 '10 15:06

Rob


2 Answers

While s is <= sqrt(N)
One mistake people often do in such algorithms is not precomputing square root.

while (s <= sqrt(N)) {

is much, much slower than

int limit = sqrt(N);
while (s <= limit) {

But generally speaking, Eiko is right in his comment. If you want people to offer low-level optimisations, you have to provide code.

update Ok, now about your code.

You may notice that number of iterations in your code is just little bigger than 'l'. (you may put counter inside first 'for' loop, it will be just 2-3 times bigger) And, obviously, complexity of your solution can't be less then O(l) (you can't have less than 'l' iterations).

What can make real difference is accessing memory effectively. Note that guy who wrote that article tries to reduce storage size not just because he's memory-greedy. Making compact arrays allows you to employ cache better and thus increase speed.

I just replaced boolean[] with int[] and achieved immediate x2 speed gain. (and 8x memory) And I didn't even try to do it efficiently.

update2
That's easy. You just replace every assignment a[i] = true with a[i/32] |= 1 << (i%32) and each read operation a[i] with (a[i/32] & (1 << (i%32))) != 0. And boolean[] a with int[] a, obviously.

From the first replacement it should be clear how it works: if f(i) is true, then there's a bit 1 in an integer number a[i/32], at position i%32 (int in Java has exactly 32 bits, as you know).

You can go further and replace i/32 with i >> 5, i%32 with i&31. You can also precompute all 1 << j for each j between 0 and 31 in array.

But sadly, I don't think in Java you could get close to C in this. Not to mention, that guy uses many other tricky optimizations and I agree that his could would've been worth a lot more if he made comments.

like image 184
Nikita Rybak Avatar answered Sep 28 '22 02:09

Nikita Rybak


Using the BitSet will use less memory. The Sieve algorithm is rather trivial, so you can simply "set" the bit positions on the BitSet, and then iterate to determine the primes.

like image 36
gpampara Avatar answered Sep 28 '22 01:09

gpampara