Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sieve of Eratosthenes algorithm in JavaScript running endless for large number

Tags:

I have been trying to write Sieve of Eratosthenes algorithm in JavaScript. Basically I just literally followed the steps below:

  1. Create a list of consecutive integers from 2 to (n-1)
  2. Let first prime number p equal 2
  3. Starting from p, count up in increments of p and removes each of these numbers (p and multiples of p)
  4. Go to the next number in the list and repeat 2,3,4
  5. Add unintentionally deleted prime numbers back to the list

and this is what I have come up with:

function eratosthenes(n){ var array = []; var tmpArray = []; // for containing unintentionally deleted elements like 2,3,5,7,... var maxPrimeFactor = 0; var upperLimit = Math.sqrt(n); var output = [];  // Eratosthenes algorithm to find all primes under n  // Make an array from 2 to (n - 1) //used as a base array to delete composite number from for(var i = 2; i < n; i++){     array.push(i); }  // Remove multiples of primes starting from 2, 3, 5,... for(var i = array[0]; i < upperLimit; i = array[0]){     removeMultiples:      for(var j = i, k = i; j < n; j += i){         var index = array.indexOf(j);         if(index === -1)             continue removeMultiples;         else             array.splice(index,1);     }     tmpArray.push(k); } array.unshift(tmpArray); return array; } 

It works for small numbers but not for numbers larger than one million. I used Node.js to test and the process just seems endless and no memory error shown up. I've read a solution here(also in javascript) but still cannot fully comprehend it.

Question: How to make this work for sufficiently large numbers such as one million and above?

like image 869
Bao Avatar asked Mar 18 '13 07:03

Bao


People also ask

How do you use Sieve of Eratosthenes to find prime numbers?

The Sieve of Eratosthenes method is easy to use. We need to cancel all the multiples of each prime number beginning with 2 (including the number 1, which is not a prime or composite) and encircle the rest of the numbers. The encircled numbers will be the required prime numbers.

What is the time complexity of the classic sieve?

The classical Sieve of Eratosthenes algorithm takes O(N log (log N)) time to find all prime numbers less than N.

Why Sieve of Eratosthenes algorithm is used in Python?

Sieve of Eratosthenes is a method for finding all primes up to (and possibly including) a given natural. This method works well when is relatively small, allowing us to determine whether any natural number less than or equal to is prime or composite.

Who designed the sieve for finding prime numbers?

The sieve of Eratosthenes is one of the most efficient ways to find all primes smaller than n when n is smaller than 10 million or so (Ref Wiki).


2 Answers

You are making the Sieve of Eratosthenes much slower by making use of array manipulation functions such as Array#indexOf and Array#splice which runs in linear time. When you can have O(1) for both operations involved.

Below is the Sieve of Eratosthenes following conventional programming practices:

var eratosthenes = function(n) {     // Eratosthenes algorithm to find all primes under n     var array = [], upperLimit = Math.sqrt(n), output = [];      // Make an array from 2 to (n - 1)     for (var i = 0; i < n; i++) {         array.push(true);     }      // Remove multiples of primes starting from 2, 3, 5,...     for (var i = 2; i <= upperLimit; i++) {         if (array[i]) {             for (var j = i * i; j < n; j += i) {                 array[j] = false;             }         }     }      // All array[i] set to true are primes     for (var i = 2; i < n; i++) {         if(array[i]) {             output.push(i);         }     }      return output; }; 

You can see a live example for n = 1 000 000 here.

like image 76
Alexander Avatar answered Sep 19 '22 16:09

Alexander


This question is a little stingy on the low side in the definition of what a "large number" is and accepts that it starts at only about a million, for which the current answer works; however, it uses quite a lot of memory as in one 8-byte number (a double real of 64 bits) for each element to be sieved and another 8-byte number for every found prime. That answer wouldn't work for "large numbers" of say about 250 million and above as it would exceed the amount of memory available to the JavaScript execution machine.

The following JavaScript code implementing the "infinite" (unbounded) Page Segmented Sieve of Eratosthenes overcomes that problem in that it only uses one bit-packed 16 Kilobyte page segmented sieving buffer (one bit represents one potential prime number) and only uses storage for the base primes up to the square root of the current highest number in the current page segment, with the actual found primes enumerated in order without requiring any storage; also saving time by only sieving odd composites as the only even prime is 2:

var SoEPgClass = (function () {   function SoEPgClass() {     this.bi = -1; // constructor resets the enumeration to start...   }   SoEPgClass.prototype.next = function () {     if (this.bi < 1) {       if (this.bi < 0) {         this.bi++;         this.lowi = 0; // other initialization done here...         this.bpa = [];         return 2;       } else { // bi must be zero:         var nxt = 3 + 2 * this.lowi + 262144; //just beyond the current page         this.buf = [];         for (var i = 0; i < 2048; i++) this.buf.push(0); // faster initialization 16 KByte's:         if (this.lowi <= 0) { // special culling for first page as no base primes yet:           for (var i = 0, p = 3, sqr = 9; sqr < nxt; i++, p += 2, sqr = p * p)             if ((this.buf[i >> 5] & (1 << (i & 31))) === 0)               for (var j = (sqr - 3) >> 1; j < 131072; j += p)                 this.buf[j >> 5] |= 1 << (j & 31);         } else { // other than the first "zeroth" page:           if (!this.bpa.length) { // if this is the first page after the zero one:             this.bps = new SoEPgClass(); // initialize separate base primes stream:             this.bps.next(); // advance past the only even prime of 2             this.bpa.push(this.bps.next()); // keep the next prime (3 in this case)           }           // get enough base primes for the page range...           for (var p = this.bpa[this.bpa.length - 1], sqr = p * p; sqr < nxt;             p = this.bps.next(), this.bpa.push(p), sqr = p * p);           for (var i = 0; i < this.bpa.length; i++) { //for each base prime in the array             var p = this.bpa[i];             var s = (p * p - 3) >> 1; //compute the start index of the prime squared             if (s >= this.lowi) // adjust start index based on page lower limit...               s -= this.lowi;             else { //for the case where this isn't the first prime squared instance               var r = (this.lowi - s) % p;               s = (r != 0) ? p - r : 0;             }             //inner tight composite culling loop for given prime number across page             for (var j = s; j < 131072; j += p) this.buf[j >> 5] |= 1 << (j & 31);           }         }       }     }     //find next marker still with prime status     while (this.bi < 131072 && this.buf[this.bi >> 5] & (1 << (this.bi & 31))) this.bi++;     if (this.bi < 131072) // within buffer: output computed prime       return 3 + ((this.lowi + this.bi++) * 2);     else { // beyond buffer range: advance buffer       this.bi = 0;       this.lowi += 131072;       return this.next(); // and recursively loop just once to make a new page buffer     }   };   return SoEPgClass; })(); 

The above code can be utilized to count the primes up to the given limit by the following JavaScript code:

window.onload = function () {   var elpsd = -new Date().getTime();   var top_num = 1000000000;   var cnt = 0;   var gen = new SoEPgClass();   while (gen.next() <= top_num) cnt++;   elpsd += (new Date()).getTime();   document.getElementById('content')     .innerText = 'Found ' + cnt + ' primes up to ' + top_num + ' in ' + elpsd + ' milliseconds.'; }; 

If the above two pieces of JavaScript code are put into a file named app.js in the same folder as the following HTML code named whatever.html, you will be able to run the code in your browser by opening the HTML file in it:

<!DOCTYPE html>  <html lang="en">   <head>     <meta charset="utf-8" />     <title>Page Segmented Sieve of Eratosthenes in JavaScript</title>     <script src="app.js"></script>   </head>   <body>     <h1>Page Segmented Sieve of Eratosthenes in JavaScript.</h1>      <div id="content"></div>   </body> </html> 

This code can sieve to the one billion range is a few 10's of seconds when run on a JavaScript execution engine using Just-In-Time (JIT) compilation such as Google Chrome's V8 engine. Further gains can be achieved by using extreme wheel factorization and pre-culling of the page buffers of the lowest base primes in which case the amount of work performed can be cut by a further factor of four, meaning that the number of primes can be counted up to a billion in a few seconds (counting does not require enumeration as used here but rather can use bit count techniques on the page segment buffers directly), although at the cost of increased code complexity.

EDIT_ADD:

The execution speed can be sped up by a factor of three or more by using TypedArray and the asm.js optimizations from ECMAScript 2015 (now supported in all common browsers) with the code modifications as follows:

"use strict"; var SoEPgClass = (function () {   function SoEPgClass() {     this.bi = -1; // constructor resets the enumeration to start...     this.buf = new Uint8Array(16384);   }   SoEPgClass.prototype.next = function () {     if (this.bi < 1) {       if (this.bi < 0) {         this.bi++;         this.lowi = 0; // other initialization done here...         this.bpa = [];         return 2;       } else { // bi must be zero:         var nxt = 3 + 2 * this.lowi + 262144; // just beyond the current page         for (var i = 0; i < 16384; ++i) this.buf[i] = 0 >>> 0; // zero buffer         if (this.lowi <= 0) { // special culling for first page as no base primes yet:           for (var i = 0, p = 3, sqr = 9; sqr < nxt; ++i, p += 2, sqr = p * p)             if ((this.buf[i >> 3] & (1 << (i & 7))) === 0)               for (var j = (sqr - 3) >> 1; j < 131072; j += p)                 this.buf[j >> 3] |= 1 << (j & 7);         } else { // other than the first "zeroth" page:           if (!this.bpa.length) { // if this is the first page after the zero one:             this.bps = new SoEPgClass(); // initialize separate base primes stream:             this.bps.next(); // advance past the only even prime of 2             this.bpa.push(this.bps.next()); // keep the next prime (3 in this case)           }           // get enough base primes for the page range...           for (var p = this.bpa[this.bpa.length - 1], sqr = p * p; sqr < nxt;             p = this.bps.next(), this.bpa.push(p), sqr = p * p);           for (var i = 0; i < this.bpa.length; ++i) { // for each base prime in the array             var p = this.bpa[i] >>> 0;             var s = (p * p - 3) >>> 1; // compute the start index of the prime squared             if (s >= this.lowi) // adjust start index based on page lower limit...               s -= this.lowi;             else { // for the case where this isn't the first prime squared instance               var r = (this.lowi - s) % p;               s = (r != 0) ? p - r : 0;             }             if (p <= 8192) {               var slmt = Math.min(131072, s + (p << 3));               for (; s < slmt; s += p) {                 var msk = (1 >>> 0) << (s & 7);                 for (var j = s >>> 3; j < 16384; j += p) this.buf[j] |= msk;               }             }             else               // inner tight composite culling loop for given prime number across page               for (var j = s; j < 131072; j += p) this.buf[j >> 3] |= (1 >>> 0) << (j & 7);           }         }       }     }     //find next marker still with prime status     while (this.bi < 131072 && this.buf[this.bi >> 3] & ((1 >>> 0) << (this.bi & 7)))       this.bi++;     if (this.bi < 131072) // within buffer: output computed prime       return 3 + ((this.lowi + this.bi++) << 1);     else { // beyond buffer range: advance buffer       this.bi = 0;       this.lowi += 131072;       return this.next(); // and recursively loop just once to make a new page buffer     }   };   return SoEPgClass; })(); 

The speed up works because it uses the pre-typed ECMAScript primitive arrays to avoid overheads by directly using integers in the arrays (also avoiding wasting space by using float representations) and also uses the type hints available using asm.js in causing bit manipulations to use unsigned integers/bytes. as well, to save the time for array allocations, it now allocations the sieving array once and just zeros it for each new page segment. It now sieves to a billion in about 16 seconds on a low end 1.92 Gigahertz CPU rather than about 50 seconds. As well, the algorithm is modified to simplify the inner composite number representation (in bit packed bits) for extra speed for smaller primes, which is the majority of the culling operations.

Note that now about 60% of the expended time is now spent in just enumerating the found primes. This could be greatly reduced for the usual use of such a sieve to just count the found primes by just summing the number of zero bits in the array for each segment page. If this was done, the time to sieve to a billion would be something close to 7 seconds on this low end CPU and there may be another few optimizations possible (all timing using the Google Chrome version 72 V8 JavaScript engine which is constantly being improved and later versions may run faster).

TBH, I personally dislike JavaScript with all its extensions and complexities that have been necessary to make it a "modern" language and in particular don't like dynamic typing, so embraced Microsoft's TypeScript when it appeared some years ago. The above code is actually a modification of the code as output from TypeScript along with its emphasis on statically typed Object Oriented Programming (OOP). It occurred to me that the calling of the "next" instance method through the standard way of adding methods to "prototype" may be much slower than just calling a function so I tested it and found that to be exactly the case, with this runnable link enumerating the found primes about two and a half times faster just by changing the enumeration to a simple output closure function.

Now, we can eliminate the primes enumeration entirely by just counting the number of found primes as shown in this other runnable link with modified code, showing that even with the above improvement, enumeration of the found primes still costs almost as much time as actually doing the sieving with this algorithm with one able to determine the enumeration time as the difference between the run time for the above two links to runnable code.

Note that the run times for the links will be different (and likely shorter) than as I mention here as most current CPU's will be faster and more powerful than the tablet Windows CPU I am currently using (the Intel x5-Z3850 at 1.92 Gigahertz and the JavaScript gets run on the machine on which you are viewing the link.

This then makes JavaScript only a little slower than the same algorithm implemented on the JVM or DotNet, which is of course still much slower than the highly optimized native code as compiled from languages such as C/C++, Rust, Nim, Haskell, Swift, FreePascal, Julia, etc. which can run this algorithm in about two seconds on this low end CPU. WebAssembly can run this algorithm up to about two to three times faster than the JavaScript here, depending on the browser implementation; as well, when the WebAssembly specification is fully complete and implemented, we will have multi-threading support for further gains by the factor of the number of effective cores used.

END_EDIT_ADD

EDIT_ADD_MORE_AND_LATER_EVEN_MORE:

Once the above fairly small modifications are done to just count the found primes efficiently rather then enumerate them thus making the counting time a small overhead as compared to sieving them, it would be worth while to make the more extensive changes to use maximum wheel factorization (by not only 2 for "odds-only", but also 3, 5, and 7 for a wheel that covers a span of 210 potential primes) and also pre-culling on initialization of the small sieving arrays so that it is not necessary to cull by the following primes of 11, 13, 17, and 19. This reduces the number of composite number culling operations when using the page segmented sieve by a factor of about four to a range of a billion and can be written so that it runs about four times as fast due to the reduced operations with each culling operation about the same speed as for the above code.

The way of doing the 210-span wheel factorization efficiently is to follow on this method of doing the "odds-only" sieving efficiently: the current algorithm above can be thought of as sieving one bit-packed plane out of two where the other plane can be eliminated as it contains only the even numbers above two; for the 210-span we can define 48 bit-packed arrays of this size representing possible primes of 11 and above where all the other 162 planes contain numbers which are factors of two, three, five, or seven and thus don't need to be considered. In this way it is just as efficient to sieve with less memory requirements (by over a half as compared to "odds-only" and as much efficiency as here, where one 48-plane "page" represents 16 Kilobytes = 131072 bits per plane times 210 which is a range of 27,525,120 numbers per sieve page segment, thus only 40 page segments to sieve to a billion (instead of almost four thousand as above), and therefore less overhead in start address calculation per base prime per page segment for a further gain in efficiency.

Although the extended code described above is a few hundred lines and long to post here, it can count the number of primes to a billion in under two seconds on my low end Intel 1.92 Gigahertz CPU using the Google V8 JavaScript engine, which is about four to five times slower than the same algorithm run in native code. This is about the limit of what we can do in JavaScript, with further advanced techniques of "loop-unrolling" and (of course) multi-processing unavailable. However, it is enough to almost match the hand-optimized reference C implementation of the Sieve of Atkin on this low-end CPU, which runs at about 1.4 seconds.

ADDED: I have explained this in even better detail with a runnable snippet in another StackOverflow answer, and in cross-linked other answers to that thread.

Although the above code is quite efficient up to a range of about 16 billion, other improvements can help maintain the efficiency to even larger ranges of several tens of thousands of billions such that one could count the number of primes up to about 1e14 in a few days using JavaScript on a faster CPU. This is interesting as the number of primes to this range wasn't known until 1985 and then was determined by a numerical analysis technique as the computers of that day weren't powerful enough to run the Sieve of Eratosthenes fast enough for that range in a reasonable time.

With my current "anti-JavaScript" and pro functional coding style bias, I would write this code using Fable, which is a implementation of F# (statically typed ML "functional" language that also supports OOP, if desired) that transpiles to JavaScript very efficiently such that the generated code is likely to be about as fast as if it was written directly in JavaScript.

To show that the code can run almost as fast in Chrome V8 JavaScript engine using Fable (with the Elmish React interface) as writing pure JavaScript as in the last link above here is a link to a Fable online IDE containing the above algorithm. It runs very slightly slower than pure JavaScript, and the JavaScript output "Code" view shows why: the code generated for Tail Call Optimizations (TCO) isn't quite a simple loop as for the JavaScript - it would be easy to hand tune the code just for the tight inner culling loops to get the same speed. The code is written in functional style except for the array content mutation and as necessary for the sequence generator functions, which are in the same form as the JavaScript for easy understanding; it would work about as fast if this streaming generator part of the code were written to use F# sequences, with no visible mutation.

Since the above Fable code is pure F#, it could also run with the Fabulous library as a JavaScript generator from DotNet Core, or could run multi-platform and a little faster by directly running it under DotNet Core.

END_EDIT_ADD_MORE_AND_EVEN_MORE

In summary, there are all sorts of algorithms that can find the primes to a few million in the order of a second, but it takes an efficient page segmented array based Sieve of Eratosthenes algorithm to determine the primes to billions in that order of execution time.

like image 37
GordonBGood Avatar answered Sep 16 '22 16:09

GordonBGood