Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate primes from 1 to n, crashing for n > 300 million

Any suggestions as to how I can get this program to work for n = 1 trillion (aside from making upgrades / buying a new computer)?

The error is as follows: after building, the program being executing (the command line style output window pops up) and then quickly closes out, and I get the following error "ProjectPrimes.exe has stopped working (Windows is looking for a solution to this problem." I suspect this has to do with memory issues as I first encountered it at n = 20 million but that was before I chose to malloc/free the 'sieve' array (i.e., my 'sieve' array is the big array of dimensions n x 1 and with each element consisting of a 1 or 0).

The program takes ~35 seconds to run through the first 300 million integers (16,252,325 primes) so it's okay but nothing spectacular. As I'd mentioned, the goal is to be able to generate primes below 1 trillion so am a long ways off...

If relevant, here are my machine specs (in case the goal happens to be unreasonable on this machine): 2.40ghz i5, 4gb RAM, 64bit Windows 7.

An overview of methodology, for those who aren't familiar: we use the Sieve of Sundaram approach. Without getting into the proof, we first eliminate all odd non-primes below an integer "n" using the sieve function: [2*(i+j+2*i*j)+1 | i <- [1..n/2], j <- [i..an optimized upper bound]]. We then cross off the even numbers (excluding two, of course). Which leaves us with the primes.

Why does the prime function return (a pointer to an array containing) the complete set of primes below n? Well, the goal is to be able to identify both (i) the number of primes below n as well as (ii) list the primes below n. This is also why I chose to pass a pointer for the count of primes below n as an argument.

Here's the not-so-exciting 'main' function:

int main() {
    long ceiling = 300*1000*1000;
    long *numPrimes;
    long *primes;

    primes = primesToSS(ceiling+1, numPrimes);
    printf("\n\nThere are %d primes below %d.\n\n",*numPrimes,ceiling);

    free(primes);
    return 0;
}

And here's the meat:

//n represents the ceiling, i.e., the integer below which we will generate primes
//cnt* is a pointer which will point the number of primes below n

long* primesToSS( long n, long* cnt ) {

    //initialize sieve by setting all elements equal to 1 (except for 0 and 1)
    long *sieve = malloc(n*sizeof(long));
    initArray(sieve, n, 1);
    sieve[0] = 0; sieve[1] = 0;

    //eliminate all odd composite numbers
    for (int i = 1; i <= n/2; ++i)
        for (int j = i; j <= (n-2*i)/(2*(2*i+1)); ++j)
            sieve[ 2*(i+j+2*i*j)+1 ] = 0;

    //eliminate all even numbers greater than two 
    //and count total number of primes below n
    long numPrimes = 1;
    for (int i = 3; i < n; ++i) {
        if (i % 2 == 0) sieve[i] = 0;
        numPrimes += sieve[i];
    }
    *cnt = numPrimes;

    //create array of primes
    long *primes = malloc(numPrimes*sizeof(int));
    long counter = 0;
    for (int i = 0; i < n; ++i) {
        if (sieve[i] == 1) {
            primes[counter] = i;
            counter++; } 
    }
    free(sieve);
    return primes;
}

void initArray( int* arr, int len, int n ) {
    for( int i = 0; i < len; ++i) arr[i] = n; }
like image 507
iceman Avatar asked Dec 25 '22 03:12

iceman


1 Answers

Let's make some observations:

  • As people have mentioned in the comments and answers, you only need 1 bit to store whether a number is a prime. Therefore, you can pack the data for 8 numbers into 1 byte. Implement your own bitset data structure to make the code cleaner.
  • Also mentioned in the comments, since all prime numbers larger than 2 are odd numbers, there is no need to store data for even numbers. This allow you to pack 16 numbers into 1 byte.
  • Following the idea of wheel factorization, you can further pack 24 numbers into 1 bytes, if you store only the bits for numbers congruent to 1 or 5 modulo 6. Higher modulo will save slightly more space (with diminishing return), but the logic to access the bitset data structure may slow down the algorithm.
  • For the program to work with 1 trillion (1012) numbers, as you sieve, you only need to collect the list of prime numbers less than 1 million (106). Larger numbers are unnecessary, since the other factor in the compound numbers less than 1 trillion would already have been covered by the list.
  • Packing numbers at 16 numbers/byte (the most basic setup you should do), you would require ~58 GiB of RAM. However, there is no need to allocate for the whole range. Just allocate for a smaller range of a few hundred million to a few billion numbers (try to vary the number for highest performance), which translate to at most a few hundred MiB, then use the list of prime number less than 1 million to sieve the ranges.

    This step can be done in parallel - you only need to share the list of prime numbers less than 1 million.

    By the way, this technique is called segmented sieve.

like image 115
nhahtdh Avatar answered Dec 27 '22 16:12

nhahtdh