Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can anyone perhaps teach me how to further optimize this 'print up to the nth prime number' script? [closed]

I'm a 17 year old getting started with programming with the help of the Python programming language.

I've been seeking to optimize this algorithm, perhaps by eliminating one of the loops, or with a better test to check for prime numbers.

Trying to calculate and display 100000 prime numbers has the script pausing for about 6 seconds as it populates the list with primes before the primes list is returned to the console as output.

I've been experimenting with using

print odd,

to simply print every found prime number, which is faster for smaller inputs like n = 1000, but for n = 1000000 the list itself prints much faster (both in the python shell and in the console).

Perhaps the entire code/algorithm should be revamped, but the script should remain essentially the same: The user types in the number of prime numbers to be printed (n) and the script returns all prime numbers up to the nth prime number.

from time import time
odd = 1
primes = [2]
n = input("Number of prime numbers to print: ")
clock = time()
def isPrime(number):
    global primes
    for i in primes:
        if i*i > number:
            return True
        if number%i is 0:
            return False
while len(primes) < n:
    odd += 2
    if isPrime(odd):
        primes += [odd]
print primes
clock -= time()
print "\n", -clock
raw_input()

I might wanna rewrite the whole script to use a sieve like the Sieve of Atkin: http://en.wikipedia.org/wiki/Sieve_of_Atkin

However, I am simply a beginner at Python (or even at programming: I started writing code only 2 weeks ago) and it would be quite a challenge for me to figure out how to code a Sieve of Atkin algorithm in Python.

I wish a google hacker out there would hand hold me through stuff like this :(

like image 570
Sweetgirl17 Avatar asked Nov 05 '22 17:11

Sweetgirl17


2 Answers

You could use prime sieve, and with a simple twist:

  1. Define the first prime 2 as you do, set the largest number reached (max) to 2;
  2. Generate a list of n consecutive numbers from max+1 to max+n;
  3. Use sieve with the primes on this list. When sieving, set the beginning number for each prime to the smallest number in the list that could be divided by the prime;
  4. If the amount is not reacher, goto 2.

This way, you could control the length of the list, and as the length grows larger, the speed will be faster. However, this is a total rework of the algorithm, and is harder to program.

Here's a sample code, which is quite crude, but this only takes less than 70% time of the original:

from math import sqrt
from time import time
primes = [2]
max = 3
n = input("Number of prime numbers to print: ")
r=2
clock = time()
def sieve(r):
    global primes
    global max
    s = set(range(max,max+r))
    for i in primes:
        b=max//i
        if (b*i<max):
            b=b+1
        b=b*i
        while b<=max+r-1:
            if b in s:
                s.remove(b)
            b=b+i
    for i in s:
        primes.append(i)
while len(primes) < n:
    r=primes[-1]
    sieve(r)
    max=max+r
primes=primes[0:n]
print primes
clock -= time()
print "\n", -clock
raw_input()

There are many ways to improve this, this just shows the notion of the approach.

Also, this can blow up the memory when the number is large. I used the dynamic limit try to somewhat relieve this.

And if you are really curious (and fearless), you could look at the more complicated implementations in various open source projects. One example is Pari/GP, which is written in C++, and is blazing fast (I tested 1 to 50000000 in less than 1 min, if I remember correctly). Translating them to Python may be hard, but will be helpful, perhaps not just for yourself;-)

like image 78
zw324 Avatar answered Nov 09 '22 04:11

zw324


One simple optimizations which could be applied without hacking the code completely.

  • the i*i on every prime gets very wasteful as the list gets longer. Instead calculate the square root of i outside the loop and test against this value inside the loop.

However square root is itself and expensive calculation and the majority of candidate numbers will be rejected as divisible by one of the lower primes (3,5,7) so this turns out to be not such a good optimization (pessimization?). But we don't actually need to be that precise and a simple check that the prime is less than one third of the value has a similar effect without the computational cost of the square root calculation, but, at the expense of a relatively few unnecessary test.

like image 33
James Anderson Avatar answered Nov 09 '22 05:11

James Anderson