Why is Sieve of Eratosthenes more efficient than the simple "dumb" algorithm?

Tags:

If you need to generate primes from 1 to N, the "dumb" way to do it would be to iterate through all the numbers from 2 to N and check if the numbers are divisable by any prime number found so far which is less than the square root of the number in question.

As I see it, sieve of Eratosthenes does the same, except other way round - when it finds a prime N, it marks off all the numbers that are multiples of N.

But whether you mark off X when you find N, or you check if X is divisable by N, the fundamental complexity, the big-O stays the same. You still do one constant-time operation per a number-prime pair. In fact, the dumb algorithm breaks off as soon as it finds a prime, but sieve of Eratosthenes marks each number several times - once for every prime it is divisable by. That's a minimum of twice as many operations for every number except primes.

Am I misunderstanding something here?

746

asked Apr 13 '11 12:04

Vilx-

3 Answers

In the trial division algorithm, the most work that may be needed to determine whether a number n is prime, is testing divisibility by the primes up to about sqrt(n).

That worst case is met when n is a prime or the product of two primes of nearly the same size (including squares of primes). If n has more than two prime factors, or two prime factors of very different size, at least one of them is much smaller than sqrt(n), so even the accumulated work needed for all these numbers (which form the vast majority of all numbers up to N, for sufficiently large N) is relatively insignificant, I shall ignore that and work with the fiction that composite numbers are determined without doing any work (the products of two approximately equal primes are few in number, so although individually they cost as much as a prime of similar size, altogether that's a negligible amount of work).

So, how much work does the testing of the primes up to N take?

By the prime number theorem, the number of primes <= n is (for n sufficiently large), about n/log n (it's n/log n + lower order terms). Conversely, that means the k-th prime is (for k not too small) about k*log k (+ lower order terms).

Hence, testing the k-th prime requires trial division by pi(sqrt(p_k)), approximately 2*sqrt(k/log k), primes. Summing that for k <= pi(N) ~ N/log N yields roughly 4/3*N^(3/2)/(log N)^2 divisions in total. So by ignoring the composites, we found that finding the primes up to N by trial division (using only primes), is Omega(N^1.5 / (log N)^2). Closer analysis of the composites reveals that it's Theta(N^1.5 / (log N)^2). Using a wheel reduces the constant factors, but doesn't change the complexity.

In the sieve, on the other hand, each composite is crossed off as a multiple of at least one prime. Depending on whether you start crossing off at 2*p or at p*p, a composite is crossed off as many times as it has distinct prime factors or distinct prime factors <= sqrt(n). Since any number has at most one prime factor exceeding sqrt(n), the difference isn't so large, it has no influence on complexity, but there are a lot of numbers with only two prime factors (or three with one larger than sqrt(n)), thus it makes a noticeable difference in running time. Anyhow, a number n > 0 has only few distinct prime factors, a trivial estimate shows that the number of distinct prime factors is bounded by lg n (base-2 logarithm), so an upper bound for the number of crossings-off the sieve does is N*lg N.

By counting not how often each composite gets crossed off, but how many multiples of each prime are crossed off, as IVlad already did, one easily finds that the number of crossings-off is in fact Theta(N*log log N). Again, using a wheel doesn't change the complexity but reduces the constant factors. However, here it has a larger influence than for the trial division, so at least skipping the evens should be done (apart from reducing the work, it also reduces storage size, so improves cache locality).

So, even disregarding that division is more expensive than addition and multiplication, we see that the number of operations the sieve requires is much smaller than the number of operations required by trial division (if the limit is not too small).

Summarising:
Trial division does futile work by dividing primes, the sieve does futile work by repeatedly crossing off composites. There are relatively few primes, but many composites, so one might be tempted to think trial division wastes less work.
But: Composites have only few distinct prime factors, while there are many primes below sqrt(p).

answered Oct 17 '22 19:10

Daniel Fischer

In the naive method, you do O(sqrt(num)) operations for each number num you check for primality. Ths is O(n*sqrt(n)) total.

In the sieve method, for each unmarked number from 1 to n you do n / 2 operations when marking multiples of 2, n / 3 when marking those of 3, n / 5 when marking those of 5 etc. This is n*(1/2 + 1/3 + 1/5 + 1/7 + ...), which is O(n log log n). See here for that result.

So the asymptotic complexity is not the same, like you said. Even a naive sieve will beat the naive prime-generation method pretty fast. Optimized versions of the sieve can get much faster, but the big-oh remains unchanged.

The two are not equivalent like you say. For each number, you will check divisibility by the same primes 2, 3, 5, 7, ... in the naive prime-generation algorithm. As you progress, you check divisibility by the same series of numbers (and you keep checking against more and more as you approach your n). For the sieve, you keep checking less and less as you approach n. First you check in increments of 2, then of 3, then 5 and so on. This will hit n and stop much faster.

answered Oct 17 '22 20:10

IVlad

Because with the sieve method, you stop marking mutiples of the running primes when the running prime reaches the square root of N.

Say, you want to find all primes less than a million.

First you set an array

for i = 2 to 1000000
  primetest[i] = true

Then you iterate

for j=2 to 1000         <--- 1000 is the square root of 10000000
  if primetest[j]                                    <--- if j is prime
    ---mark all multiples of j (except j itself) as "not a prime"
    for k = j^2 to 1000000 step j
      primetest[k] = false

You don't have to check j after 1000, because j*j will be more than a million. And you start from j*j (you don't have to mark multiples of j less than j^2 because they are already marked as multiples of previously found, smaller primes)

So, in the end you have done the loop 1000 times and the if part only for those j's that are primes.

Second reason is that with the sieve, you only do multiplication, not division. If you do it cleverly, you only do addition, not even multiplication.

And division has larger complexity than addition. The usual way to do division has O(n^2) complexity, while addition has O(n).

answered Oct 17 '22 20:10

ypercubeᵀᴹ

Related questions
                            
                                What is the safe way to fill multidimensional array using std::fill?
                            
                                The expected number of inversions--From Introduction to Algorithms by Cormen
                            
                                Facial recognition/merging software [closed]
                            
                                Cardinal direction algorithm in Java
                            
                                Bloom filter or cuckoo hashing?
                            
                                How does Java implement hash tables?
                            
                                Is it possible to enumerate computer programs?
                            
                                Finding the number missing in the sequence [duplicate]
                            
                                percentage difference between two text files
                            
                                Howto create combinations of several vectors without hardcoding loops in C++?
                            
                                Reduce number of points in line
                            
                                How is arctan implemented?
                            
                                Datastructure ambiguity
                            
                                How to think in recursive way?
                            
                                Java: How to remove elements from a list while iterating over/adding to it
                            
                                Unbiased random number generator using a biased one
                            
                                Determine whether the two classes are linearly separable (algorithmically in 2D)
                            
                                What would the Big O be of a nested for loop with an Any() inside it?
                            
                                Euclidean greatest common divisor for more than two numbers
                            
                                Modify a given number to find the required sum?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why is Sieve of Eratosthenes more efficient than the simple "dumb" algorithm?

Tags:

performance

algorithm

big-o