Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Retrieving the next smallest element of a Python set

The Sieve of Eratosthenes is a reasonably fast method of generating primes up to a limit k as follows:

  1. Begin with the set p = (2, 3, 4, ..., k) and i = 2.
  2. Starting from i^2, remove all multiples of i from p.
  3. Repeat for the next smallest i in p, until i >= sqrt(k).

My current implementation looks like this (with the obvious optimisation of pre-filtering all the even numbers):

# Compute all prime numbers less than k using the Sieve of Eratosthenes
def sieve(k):
    s = set(range(3, k, 2))
    s.add(2)

    for i in range(3, int(sqrt(k)), 2):
        if i in s:
            for j in range(i ** 2, k, i * 2):
                s.discard(j)

    return sorted(s)

EDIT: Here is the equivalent list based code:

def sieve_list(k):
    s = [True] * k
    s[0] = s[1] = False
    for i in range(4, k, 2):
        s[i] = False

    for i in range(3, int(sqrt(k)) + 2, 2):
        if s[i]:            
            for j in range(i ** 2, k, i * 2):
                s[j] = False

    return [2] + [ i for i in range(3, k, 2) if s[i] ]

This works, but is not entirely correct. The lines:

for i in range(3, int(sqrt(k)), 2):
    if i in s:
        [...]

Find the next smallest element of s by testing set membership for every odd number. Ideally the implementation should actually be:

while i < sqrt(k):
    [...]
    i = next smallest element in s

However, since set is unordered, I do not know how (or even if it is possible) to get the next smallest element in a more efficient way. I have considered using a list with True/False flags for primes, but you still have to walk the list looking for the next True element. You can't just actually remove elements from the list either, as this makes efficiently removing composite numbers in step 2 impossible.

Is there a way to find the next smallest element more efficiently? If not, is there some other data structure that allows O(1) removal by value and an efficient way to find the next smallest element?

like image 708
verdesmarald Avatar asked Sep 16 '12 15:09

verdesmarald


People also ask

How do I find the second lowest value in a list in Python?

To find the second smallest number in a list: Use the set() class to convert the list to a set . Use the sorted() function to get a sorted list. Access the list item at index 1 to get the second smallest number.

How do you find the smallest number in a set in Python?

Use Python's min() and max() to find smallest and largest values in your data. Call min() and max() with a single iterable or with any number of regular arguments.

How do you extract an element from a set in Python?

We can access the first item in the set by using the iter() function, we have to apply the next() to it to get the first element. We can also use the first() method from the iteration_utilities module, Which will return the first element.

How do you access set elements by index in Python?

There is no index attached to any element in a python set. So they do not support any indexing or slicing operation.


1 Answers

Sets are unordered because they are internally implemented as a hashset. There's no efficient way to find the minimum element in such a data structure; min(s) would be the most Pythonic way to do so (but it is O(n)).

You can use a collections.deque along with your set. Use the deque to store the list of elements in sorted order. Every time you need to get the minimum, pop elements off the deque until you find one that is in your set. This has amortized O(1) cost across your entire input array (since you only have to pop n times).

I should also point out that there can be no data structure that has O(n) creation from a list (or O(1) insertion), O(1) removal by value and O(1) minimum-finding; such a data structure could be used to trivially implement O(n) general sorting, which is (information-theoretic) impossible. The hashset gets pretty close, but has to sacrifice efficient minimum-finding.

like image 95
nneonneo Avatar answered Oct 13 '22 00:10

nneonneo