Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the Nth Twin Prime

I was trying to solve a problem on SPOJ. We are required to calculate the nth twin prime pair( primes differing by 2). n can be as large as 10^5. I tried a precalculation using a sieve, I had to sieve up to 10^8 to get the maximum n twin prime, but the time limit is strict(2s) and it times out. I noticed people have solved it in 0.00 seconds, so i looked around for a formula on google, and couldnt get anything helpful. Could someone please guide me?

Thanks in advance!!

like image 691
frodo Avatar asked Apr 13 '12 15:04

frodo


People also ask

How to find twin primes up to a number n?

The steps to find twin primes up to a number N are: 1 Identify the first prime numbers up to N 2 Identify the Twin prime pairs among the identified primes 3 Display result More ...

What is a twin prime pair?

A twin Prime Pair is a pair of prime numbers (a,b) such that a is less than or greater than b by 2. In other words, they are prime numbers pairs such that the difference between them is exactly equal to two. The Twin prime conjecture states that there are infinitely many twin primes.

What is the difference between composite numbers and twin primes?

The numbers which have more than two factors, are called composite numbers. For example: 4,6,8,10,12,14,15, etc. are composite numbers. Q.1: Give three pairs of prime numbers whose difference is 2. Solution: The three pairs of prime numbers are (3, 5), (5, 7) and (11, 13). Q.2: What is the difference between twin primes and co-prime numbers?

How do I use the nth prime page?

Welcome to the Nth Prime Page! Here's how it works: Enter a value for n below, from 1 to 10 12, inclusive. The server will return the n th prime number (counting 2 as the first). Commas and scientific notation (e.g. 1.0e12) are allowed.


1 Answers

Out of curiosity, I solved the problem using two variants of a Sieve of Eratosthenes. The first variant completed on the testing machine in 0.93s and the second in 0.24s. For comparison, on my computer, the first finished in 0.08s and the second in 0.04s.

The first was a standard sieve on the odd numbers, the second a slightly more elaborate sieve omitting also the multiples of 3 in addition to the even numbers.

The testing machines of SPOJ are old and slow, so a programme runs much longer on them than on a typical recent box; and they have small caches, therefore it is important to keep the computation small.

Doing that, a Sieve of Eratosthenes is easily fast enough. However, it is really important to keep memory usage small. The first variant, using one byte per number, gave "Time limit exceeded" on SPOJ, but ran in 0.12s on my box. So, given the characteristics of the SPOJ testing machines, use a bit-sieve to solve it in the given time.

On the SPOJ machine, I got a significant speedup (running time 0.14s) by further reducing the space of the sieve by half. Since - except for the first pair (3,5) - all prime twins have the form (6*k-1, 6*k+1), and you need not know which of the two numbers is composite if k doesn't give rise to a twin prime pair, it is sufficient to sieve only the indices k.

(6*k + 1 is divisible by 5 if and only if k = 5*m + 4 for some m, and 6*k - 1 is divisible by 5 if and only if k = 5*m+1 for some m, so 5 would mark off 5*m ± 1, m >= 1 as not giving rise to twin primes. Similarly, 6*k+1 is divisible by 13 if and only if k = 13*m + 2 for some m and 6*k - 1 if and only if k = 13*m - 2 for some m, so 13 would mark off 13*m ± 2.)

This doesn't change the number of markings, so with a sufficiently large cache, the change in running time is small, but for small caches, it's a significant speedup.

One more thing, though. Your limit of 108 is way too high. I used a lower limit (20 million) that doesn't overestimate the 100,000th twin prime pair by so much. With a limit of 108, the first variant would certainly not have finished in time, the second probably not.

With the reduced limit, a Sieve of Atkin needs to be somewhat optimised to beat the Eratosthenes variant omitting even numbers and multiples of 3, a naive implementation will be significantly slower.


Some remarks concerning your (wikipedia's pseudocode) Atkin sieve:

#define limit 100000000
int prime1[MAXN];
int prime2[MAXN];

You don't need the second array, the larger partner of a prime twin pair can easily be computed from the smaller. You're wasting space and destroy cache locality reading from two arrays. (That's minor compared to the time needed for sieving, though.)

    int root = ceil(sqrt(limit));
    bool sieve[limit];

On many operating systems nowadays, that is an instant segfault, even with a reduced limit. The stack size is often limited to 8MB or less. Arrays of that size should be allocated on the heap.

As mentioned above, using one bool per number makes the programme run far slower than necessary. You should use a std::bitset or std::vector<bool> or twiddle the bits yourself. Also it is advisable to omit at least the even numbers.

    for (int x = 1; x <= root; x++)
    {
        for (int y = 1; y <= root; y++)
        {
//Main part of Sieve of Atkin
            int n = (4*x*x)+(y*y);
            if (n <= limit && (n % 12 == 1 || n % 12 == 5)) sieve[n] ^= true;
            n = (3*x*x)+(y*y);
            if (n <= limit && n % 12 == 7) sieve[n] ^= true;
            n = (3*x*x)-(y*y);
            if (x > y && n <= limit && n % 12 == 11) sieve[n] ^= true;
        }
    }

This is horribly inefficient. It tries far too many x-y-combinations, for each combination it does three or four divisions to check the remainder modulo 12 and it hops back and forth in the array.

Separate the different quadratics.

For 4*x^2 + y^2, it is evident that you need only consider x < sqrt(limit)/2 and odd y. Then the remainder modulo 12 is 1, 5, or 9. If the remainder is 9, then 4*x^2 + y^2 is actually a multiple of 9, so such a number would be eliminated as not square-free. However, it is preferable to omit the multiples of 3 from the sieve altogether and treat the cases n % 12 == 1 and n % 12 == 5 separately.

For 3*x^2 + y^2, it is evident that you need only consider x < sqrt(limit/3) and a little bit of thought reveals that x must be odd and y even (and not divisible by 3).

For 3*x^2 - y^2 with y < x, it is evident that you need only consider y < sqrt(limit/2). Looking at the remainders modulo 12, you see that y mustn't be divisible by 3 and x and y must have different parity.

like image 148
Daniel Fischer Avatar answered Nov 15 '22 06:11

Daniel Fischer