 

Efficiency of Sieve of Eratosthenes algorithm

I am trying to understand the "Sieve of Eratosthenes". Here is my algorithm (code below), and a list of features that I cannot understand (in order).

  1. Why is i * i more efficient than i * 2? Yes, I can understand it means fewer iterations, and is therefore more efficient, but doesn't it skip some numbers (for example, i = 9 => j = 81 skips 18, 27, 36, ...)?
  2. On Wikipedia I found that the space complexity is O(n), and that's understandable: whatever number we enter, it creates an array of that size. But the time complexity is where things get confusing. I found the notation O(n(logn)(loglogn)) -- what is that? According to my understanding we have 2 full iterations and 1 partial iteration, therefore O(n^2 * logn).
#include <iostream>
using namespace std;

int main() {
  cout << "Enter number:" << endl;

  int arrSize;
  cin >> arrSize;

  // primesArr[k - 1] represents the number k, so index 0 stands for 1.
  // (A variable-length array like this is a GCC/Clang extension, not standard C++.)
  bool primesArr[arrSize];
  primesArr[0] = false;  // 1 is not prime
  for (int i = 1; i < arrSize; i++) primesArr[i] = true;

  for (int i = 2; i < arrSize; i++)
    if (primesArr[i - 1]) {
      cout << i << endl;
      /* for (int j = i * 2; j < arrSize; j += i)  -- less efficient */
      for (int j = i * i; j < arrSize; j += i)
        primesArr[j - 1] = false;  // cross off the multiples of i
    }

  return 0;
}
asked Dec 23 '22 by Ramazan Chasygov


2 Answers

Why is i * i more efficient than i * 2? Yes, I can understand it means fewer iterations, and is therefore more efficient, but doesn't it skip some numbers (for example, i = 9 => j = 81 skips 18, 27, 36, ...)?

You are referring to

for (int j = i * i; j < arrSize; j += i)

Note that i * i is the initial value of the loop counter j, so every multiple of i from i * i up to arrSize gets marked off. The multiples we skip over between i * 2 and i * i have already been marked off during previous iterations. Let's walk through the first few:

When i == 2, we mark off the multiples of 2 starting at 4 (4, 6, 8, etc.). When i == 3, if we started at j = 3 * 2 = 6, we would mark off 6 again before reaching 9, 12, 15, etc. Since 6 is a multiple of 2 and was already marked off, we can skip straight to 3 * 3 == 9.

When we reach i == 5, if we started at j == 5 * 2 == 10, we would mark off 10 (already taken care of as a multiple of 2), 15 (a multiple of 3), and 20 (another multiple of 2) before finally reaching 25, the first multiple of 5 that is not a multiple of any prime less than 5.
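If you want to convince yourself that nothing is skipped, you can sieve the same range twice, once starting the inner loop at i * 2 and once at i * i, and compare the results. Here is a minimal sketch of such a check (the helper name sieve and the bound of 1000 are made up for the example; they are not part of the code in the question):

#include <iostream>
#include <vector>

// Sieve 0..limit; startAtSquare chooses where the inner loop begins.
std::vector<bool> sieve(int limit, bool startAtSquare) {
  std::vector<bool> isPrime(limit + 1, true);
  isPrime[0] = isPrime[1] = false;
  for (long long i = 2; i <= limit; i++) {
    if (!isPrime[i]) continue;
    for (long long j = startAtSquare ? i * i : i * 2; j <= limit; j += i)
      isPrime[j] = false;
  }
  return isPrime;
}

int main() {
  const int limit = 1000;  // arbitrary test bound
  // Both variants should cross off exactly the same numbers.
  bool same = (sieve(limit, true) == sieve(limit, false));
  std::cout << (same ? "identical" : "different") << std::endl;  // prints "identical"
  return 0;
}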

The time complexity here is where things get confusing. I found the notation O(n(logn)(loglogn)) -- what is that? According to my understanding we have 2 full iterations and 1 partial iteration, therefore O(n^2 * logn).

Your analysis gives a correct (if loose) upper bound: the algorithm is indeed O(n^2 * logn). A more detailed analysis proves the tighter bound O(n(logn)(loglogn)). Note that O(n(logn)(loglogn)) is a subset of O(n^2 * logn), so both statements are true; the tighter one simply says more.
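To see why the two bounds are consistent, note that log log n grows far more slowly than n, so

n \log n \log \log n \;\le\; n \log n \cdot n \;=\; n^2 \log n \qquad (n \ge 2),

and therefore every function in O(n \log n \log \log n) also lies in O(n^2 \log n).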

answered Dec 25 '22 by Code-Apprentice


Why is i * i more efficient than i * 2? Doesn't it skip some numbers?

No, it doesn't, because the smaller multiples of i are covered earlier (for example, 18 and 27 in your case are crossed off while running the loop for i = 2 and i = 3).

Every number has a unique prime factorization. If i is a prime, then any multiple of i that is greater than i and smaller than i * i is also a multiple of one or more primes smaller than i, so it was already crossed off in an earlier pass. For example, with i = 7 the multiples 14, 21, 28, 35, and 42 are each divisible by 2, 3, or 5; the first composite that still needs marking is 49 = 7 * 7.
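Putting both points together, here is a sketch of how the sieve is usually written: it indexes std::vector<bool> by the number itself (so no k - 1 offset), starts the inner loop at i * i, and stops the outer loop at sqrt(n), since every composite <= n has a prime factor <= sqrt(n). This is a conventional rewrite for comparison, not the code from the question:

#include <iostream>
#include <vector>

int main() {
  int n;
  std::cout << "Enter number:" << std::endl;
  std::cin >> n;                               // assumed non-negative for this sketch

  // isPrime[k] == true means k has not been crossed off yet.
  std::vector<bool> isPrime(n + 1, true);
  if (n >= 0) isPrime[0] = false;
  if (n >= 1) isPrime[1] = false;              // 0 and 1 are not prime

  for (long long i = 2; i * i <= n; i++)       // composites > sqrt(n) are caught via a smaller prime factor
    if (isPrime[i])
      for (long long j = i * i; j <= n; j += i)  // start at i * i: smaller multiples were marked earlier
        isPrime[j] = false;

  for (int k = 2; k <= n; k++)
    if (isPrime[k]) std::cout << k << '\n';

  return 0;
}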

The notation O(n(logn)(loglogn))

From this answer:
The number of crossing-off operations is n(1/2 + 1/3 + 1/5 + 1/7 + ...) ≈ n log log n, because the sum of the reciprocals of the primes up to n grows like log log n.
If you count bit operations, then since you are dealing with numbers up to n, each number has about log n bits, which is where the factor of log n comes in, giving O(n log n log log n) bit operations.
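Spelled out, the estimate in the quoted answer is Mertens' second theorem applied to the sieve: each prime p contributes on the order of n/p crossings, so

\sum_{p \le n,\ p\ \text{prime}} \frac{n}{p} \;=\; n \sum_{p \le n} \frac{1}{p} \;\approx\; n \,(\ln \ln n + M) \;=\; O(n \log \log n),

where M ≈ 0.2615 is the Meissel-Mertens constant.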

answered Dec 26 '22 by Mohit Jain