Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find how many times each number between N and M can be expressed as a sum of a pair of primes

Consider this method:

public static int[] countPairs(int min, int max) {
    int lastIndex = primes.size() - 1;
    int i = 0;
    int howManyPairs[] = new int[(max-min)+1];

    for(int outer : primes) {
        for(int inner : primes.subList(i, lastIndex)) {
            int sum = outer + inner;

            if(sum > max)
                break;

            if(sum >= min && sum <= max)
                howManyPairs[sum - min]++;
        }

        i++;
    }

    return howManyPairs;
}

As you can see, I have to count how many times each number between min and max can be expressed as a sum of two primes.

primes is an ArrayList with all primes between 2 and 2000000. In this case, min is 1000000 and max is 2000000, that's why primes goes until 2000000.

My method works fine, but the goal here is to do something faster.

My method takes two loops, one inside the other, and it makes my algorithm an O(n²). It sucks like bubblesort.

How can I rewrite my code to accomplish the same result with a better complexity, like O(nlogn)?

One last thing: I'm coding in Java, but your reply can be in also Python, VB.Net, C#, Ruby, C or even just a explanation in English.

like image 986
Gabriel Avatar asked Aug 30 '15 03:08

Gabriel


3 Answers

For each number x between min and max, we want to compute the number of ways x can be written as the sum of two primes. This number can also be expressed as

sum(prime(n)*prime(x-n) for n in xrange(x+1))

where prime(x) is 1 if x is prime and 0 otherwise. Instead of counting the number of ways that two primes add up to x, we consider all ways two nonnegative integers add up to x, and add 1 to the sum if the two integers are prime.

This isn't a more efficient way to do the computation. However, putting it in this form helps us recognize that the output we want is the discrete convolution of two sequences. Specifically, if p is the infinite sequence such that p[x] == prime(x), then the convolution of p with itself is the sequence such that

convolve(p, p)[x] == sum(p[n]*p[x-n] for n in xrange(x+1))

or, substituting the definition of p,

convolve(p, p)[x] == sum(prime(n)*prime(x-n) for n in xrange(x+1))

In other words, convolving p with itself produces the sequence of numbers we want to compute.

The straightforward way to compute a convolution is pretty much what you were doing, but there are much faster ways. For n-element sequences, a fast Fourier transform-based algorithm can compute the convolution in O(n*log(n)) time instead of O(n**2) time. Unfortunately, this is where my explanation ends. Fast Fourier transforms are kind of hard to explain even when you have proper mathematical notation available, and as my memory of the Cooley-Tukey algorithm isn't as precise as I'd like it to be, I can't really do it justice.

If you want to read more about convolution and Fourier transforms, particularly the Cooley-Tukey FFT algorithm, the Wikipedia articles I've just linked would be a decent start. If you just want to use a faster algorithm, your best bet would be to get a library that does it. In Python, I know scipy.signal.fftconvolve would do the job; in other languages, you could probably find a library pretty quickly through your search engine of choice.

like image 189
user2357112 supports Monica Avatar answered Sep 28 '22 19:09

user2357112 supports Monica


What you´re searching is the count of Goldbach partitions for each number
in your range, and imho there is no efficient algorithm for it.

Uneven numbers have 0, even numbers below 4*10^18 are guaranteed to have more than 0,
but other than that... to start with, if even numbers (bigger than 4*10^18) with 0 partitions exist
is an unsolved problem since 1700-something, and such things as exact numbers are even more complicated.

There are some asymptotic and heuristic solutions, but if you want the exact number,
other than getting more CPU and RAM, there isn´t be much you can do.

like image 43
deviantfan Avatar answered Sep 28 '22 20:09

deviantfan


The other answers have an outer loop that goes from N to M. It's more efficient, however, for the outer loop (or loops) to be pairs of primes, used to build a list of numbers between N and M that equal their sums.

Since I don't know Java I'll give a solution in Ruby for a specific example. That should allow anyone interested to implement the algorithm in Java, regardless of whether they know Ruby.

I initially assume that two primes whose product equals a number between M and N must be unique. In other words, 4 cannot be express as 4 = 2+2.

Use Ruby's prime number library.

require 'prime'

Assume M and N are 5 and 50.

lower =  5
upper = 50

Compute the prime numbers up to upper-2 #=> 48, the 2 being the first prime number.

primes = Prime.each.take_while { |p| p < upper-2 }
  #=> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Construct an enumerator of all combinations of two primes.

enum = primes.combination(2)
  => #<Enumerator: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]:combination(2)>

We can see the elements that will be generated by this enumerator by converting it to an array.

enum.to_a
  #=> [[2, 3], [2, 5],..., [2, 47], [3, 5],..., [43, 47]] (105 elements)

Just think of enum as an array.

Now construct a counting hash whose keys are numbers between lower and upper for which there is at least one pair of primes that sum to that number and whose values are the numbers of pairs of primes that sum to the value of the associated key.

enum.each_with_object(Hash.new(0)) do |(x,y),h|
  sum = x+y
  h[sum] += 1 if (lower..upper).cover?(sum)
end
  #=> {5=>1, 7=>1, 9=>1, 13=>1, 15=>1, 19=>1, 21=>1, 25=>1, 31=>1, 33=>1,
  #    39=>1, 43=>1, 45=>1, 49=>1, 8=>1, 10=>1, 14=>1, 16=>2, 20=>2, 22=>2,
  #    26=>2, 32=>2, 34=>3, 40=>3, 44=>3, 46=>3, 50=>4, 12=>1, 18=>2, 24=>3,
  #    28=>2, 36=>4, 42=>4, 48=>5, 30=>3, 38=>1} 

This shows, for example, that there are two ways that 16 can be expressed as the sum of two primes (3+13 and 5+11), three ways for 34 (3+31, 5+29 and 11+23) and no way for 6.

If the two primes being summed need not be unique (e.g., 4=2+2 is to be included), only a slight change is needed.

arr = primes.combination(2).to_a.concat(primes.zip(primes))

whose sorted values are

a = arr.sort
  #=> [[2, 2], [2, 3], [2, 5], [2, 7],..., [3, 3],..., [5, 5],.., [47, 47]] (120 elements) 

then

a.each_with_object(Hash.new(0)) do |(x,y),h|
  sum = x+y
  h[sum] += 1 if (lower..upper).cover?(sum)
end
  #=> {5=>1, 7=>1, 9=>1, 13=>1, 15=>1, 19=>1, 21=>1, 25=>1, 31=>1, 33=>1,
  #    39=>1, 43=>1, 45=>1, 49=>1, 6=>1, 8=>1, 10=>2, 14=>2, 16=>2, 20=>2,
  #    22=>3, 26=>3, 32=>2, 34=>4, 40=>3, 44=>3, 46=>4, 50=>4, 12=>1, 18=>2,
  #    24=>3, 28=>2, 36=>4, 42=>4, 48=>5, 30=>3, 38=>2} 

a should be replaced by arr. I used a here merely to order the elements of the resulting hash so that it would be easier to read.

Since I just wanted to describe the approach, I used a brute force method to enumerate the pairs of elements of primes, throwing away 44 of the 120 pairs of primes because their sums fall outside the range 5..50 (a.count { |x,y| !(lower..upper).cover?(x+y) } #=> 44). Clearly, there's considerable room for improvement.

like image 26
Cary Swoveland Avatar answered Sep 28 '22 20:09

Cary Swoveland