Does this prime function actually work?

Tags:

Since I'm starting to get the hang of Python, I'm starting to test my newly acquired Python skills on some problems on projecteuler.net.

Anyways, at some point, I ended up making a function for getting a list of all primes up until a number 'n'.

Here's how the function looks atm:

def primes(n):
    """Returns list of all the primes up until the number n."""

    # Gather all potential primes in a list.
    primes = range(2, n + 1)
    # The first potential prime in the list should be two.
    assert primes[0] == 2
    # The last potential prime in the list should be n.
    assert primes[-1] == n

    # 'p' will be the index of the current confirmed prime.
    p = 0
    # As long as 'p' is within the bounds of the list:
    while p < len(primes):
        # Set the candidate index 'c' to start right after 'p'.
        c = p + 1
        # As long as 'c' is within the bounds of the list:
        while c < len(primes):
            # Check if the candidate is divisible by the prime.
            if(primes[c] % primes[p] == 0):
                # If it is, it isn't a prime, and should be removed.
                primes.pop(c)
            # Move on to the next candidate and redo the process.
            c = c + 1
        # The next integer in the list should now be a prime, 
        # since it is not divisible by any of the primes before it. 
        # Thus we can move on to the next prime and redo the process.
        p = p + 1       
    # The list should now only contain primes, and can thus be returned.
    return primes

It seems to work fine, although one there's one thing that bothers me. While commenting the code, this piece suddenly seemed off:

# Check if the candidate is divisible by the prime.
if(primes[c] % primes[p] == 0):
    # If it is, it isn't a prime, and should be removed from the list.
    primes.pop(c)
# Move on to the next candidate and redo the process.
c += 1

If the candidate IS NOT divisible by the prime we examine the next candidate located at 'c + 1'. No problem with that.

However, if the candidate IS divisible by the prime, we first pop it and then examine the next candidate located at 'c + 1'. It struck me that the next candidate, after popping, is not located at 'c + 1', but 'c', since after popping at 'c', the next candidate "falls" into that index.

I then thought that the block should look like the following:

# If the candidate is divisible by the prime:
if(primes[c] % primes[p] == 0):
    # If it is, it isn't a prime, and should be removed from the list.
    primes.pop(c)
# If not:
else:
    # Move on to the next candidate.
    c += 1

This above block seems more correct to me, but leaves me wondering why the original piece apparently worked just fine.

So, here are my questions:

After popping a candidate which turned out not be a prime, can we assume, as it is in my original code, that the next candidate is NOT divisible by that same prime?

If so, why is that?

Would the suggested "safe" code just do unnecessary checks on the candidates which where skipped in the "unsafe" code?

PS:

I've tried writing the above assumption as an assertion into the 'unsafe' function, and test it with n = 100000. No problems occurred. Here's the modified block:

# If the candidate is divisible by the prime:
if(primes[c] % primes[p] == 0):
    # If it is, it isn't a prime, and should be removed.
    primes.pop(c)
    # If c is still within the bounds of the list:
    if c < len(primes):
        # We assume that the new candidate at 'c' is not divisible by the prime.
        assert primes[c] % primes[p] != 0
# Move on to the next candidate and redo the process.
c = c + 1

697

asked Dec 06 '12 16:12

phaz

2 Answers

It fails for much bigger numbers. The first prime is 71, for that the candidate can fail. The smallest failing candidate for 71 is 10986448536829734695346889 which overshadows the number 10986448536829734695346889 + 142.

def primes(n, skip_range=None):
    """Modified "primes" with the original assertion from P.S. of the question.
    with skipping of an unimportant huge range.
    >>> primes(71)
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
    >>> # The smallest failing number for the first failing prime 71:
    >>> big_n = 10986448536829734695346889
    >>> primes(big_n + 2 * 71, (72, big_n))
    Traceback (most recent call last):
    AssertionError
    """
    if not skip_range:
        primes = list(range(2, n + 1))
    else:
        primes = list(range(2, skip_range[0]))
        primes.extend(range(skip_range[1], n + 1))
    p = 0
    while p < len(primes):
        c = p + 1
        while c < len(primes):
            if(primes[c] % primes[p] == 0):
                primes.pop(c)
                if c < len(primes):
                    assert primes[c] % primes[p] != 0
            c = c + 1
        p = p + 1
    return primes

# Verify that it can fail.
aprime = 71   # the first problematic prime 
FIRST_BAD_NUMBERS = (
        10986448536829734695346889, 11078434793489708690791399,
        12367063025234804812185529, 20329913969650068499781719,
        30697401499184410328653969, 35961932865481861481238649,
        40008133490686471804514089, 41414505712084173826517629,
        49440212368558553144898949, 52201441345368693378576229)

for bad_number in FIRST_BAD_NUMBERS:
    try:
        primes(bad_number + 2 * aprime, (aprime + 1, bad_number))
        raise Exception('The number {} should fail'.format(bad_number))
    except AssertionError:
        print('{} OK. It fails as is expected'.format(bad_number))

I solved these numbers by a complicated algorithm like a puzzle by searching possible remainders of n modulo small primes. The last simple step was to get the complete n (by chinese remainder theorem in three lines of Python code). I know all 120 basic solutions smaller than primorial(71) = 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 * 31 * 37 * 41 * 43 * 47 * 53 * 59 * 61 * 67 * 71 repeated periodically by all multiples of this number. I rewrote the algorithm many times for every decade of tested primes because for every decade was the solution much slower than for the previous. Maybe I find a smaller solution with the same algorithm for primes 73 or 79 in acceptable time.

Edit:

I would like to find also a complete silent fail of the unsafe original function. Maybe exists some candidate composed from different primes. This way of solution would only postpone the final outcome for later. Every step would be much more and more expensive for time and resources. Therefore only numbers composed from one or two primes are attractive.

I expect that only two solutions the hidden candidate c are good: c = p ** n or c = p1 * p ** n or c = p1 ** n1 * p ** n where p and p1 are primes and n is a power greater than 1. The primes function fails if c - 2 * p is divisible by no prime smaller than p and if all number between c-2n and c are divisible by any prime smaller than p. The variant p1*p**n requires also that the same c had failed before for p1 (p1 < p) as we already know infinite number of such candidates.

EDIT: I found a smaller example of failure: number 121093190175715194562061 for the prime 79. (which is about ninety times less than for 71) I can't continue by the same algorithm to find smaller examples because all 702612 basic solutions took more than 30 hours for the prime 79 on my laptop.

I also verified it for all candidates smaller than 400000000 (4E10) and for all relevant primes, that no candidate will fail the assertion in the question. Until you have terabytes of memory and thousands years of time, the assertion in the algorithm will pass, because your time complexity is O((n / log(n)) ^2) or very similar.

107

answered Oct 16 '22 17:10

hynekcer

Your observation seems to be accurate, which is quite a good catch.

I suspect the reason that it works, at least in some cases, is because composite numbers are actually factored into multiple primes. So, the inner loop may miss the value on the first factor, but it then picks it up on a later factor.

For a small'ish "n", you can print out values of the list to see if this is what is happening.

This method of finding primes, by the way, is based on the Sieve of Eratothenes. It is possible when doing the sieve that if "c" is a multiple of "p", then the next value is never a multiple of the same prime.

The question is: are there any cases where all values between p*x and p*(x+1) are divisible by some prime less than p and p*x+1). (This is where the algorithm would miss a value and it would not be caught later.) However, one of these values is even, so it would be eliminated on round "2". So, the real question is whether there are cases where all values between p*x and p*(x+2) are divisible by numbers less than p.

Off hand, I can't think of any numbers less than 100 that meet this condition. For p = 5, there is always a value that is not divisible by 2 or 3 between two consecutive multiples of 5.

There seems to be a lot written on prime gaps and sequences, but not so much on sequences of consecutive integers divisible by numbers less than p. After some (okay, a lot) of trial and error, I've determined that every number between 39,474 (17*2,322) and 39,491 (17*2,233) is divisible by an integer less than 17:

I am not sure if this is the first such value. However, we would have to find sequences twice as long as this. I think that is unlikely, but not sure if there is a proof.

My conclusion is that the original code might work, but that your fix is the right thing to do. Without a proof that there are no such sequences, it looks like a bug, albeit a bug that could be very, very, very rare.

answered Oct 16 '22 18:10

Gordon Linoff

Related questions
                            
                                Replace multiple elements in numpy array with 1
                            
                                Python type hint for classes that support __getitem__
                            
                                Why is caplog.text empty, even though the function I'm testing is logging?
                            
                                Numpyic way to sort an ndarray clockwise?
                            
                                Can someone explain pipe buffer deadlock?
                            
                                Python class design - Splitting up big classes into multiple ones to group functionality
                            
                                Does NLTK have a tool for dependency parsing?
                            
                                Python socket send EOF
                            
                                Making decimal.Decimal the default numerical type in Python
                            
                                Installing Python eggs under PyPy
                            
                                Finding a Python Library to Mock a Database [closed]
                            
                                Convolution computations in Numpy/Scipy
                            
                                Python pytz Converting a timestamp (string format) from one timezone to another
                            
                                python: multiprocessing Event
                            
                                Running Python script from Cocoa application using GCD
                            
                                Sort dict by value and return dict, not list of tuples [duplicate]
                            
                                Python C-API functions that borrow and steal references
                            
                                Python class that extends int doesn't entirely behave like an int
                            
                                What NLP tools to use to match phrases having similar meaning or semantics
                            
                                Django ManagementForm data is missing or has been tampered with

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Does this prime function actually work?

Tags:

python

algorithm

math

python-2.7

primes

phaz

People also ask

2 Answers

hynekcer

Gordon Linoff

Recent Activity

Donate For Us