I'm very new to Haskell, and I'm trying to find the sum of the primes below 2 million. I'm generating the primes using a sieve (I think the sieve of Eratosthenes?), but it's really, really slow and I don't know why. Here is my code.
sieve (x:xs) = x:(sieve $ filter (\a -> a `mod` x /= 0) xs)
ans = sum $ takeWhile (<2000000) (sieve [2..])
Thanks in advance.
Here's a sieve of Eratosthenes:
P = {3, 5, 7, ...} \ ⋃ { {p², p²+2p, p²+4p, ...} | p in P }
(without the 2). :) Or in "functional", i.e. list-based, Haskell:
primes = 2 : g (fix g)
  where
    g xs = 3 : gaps 5 (unionAll [[p*p, p*p+2*p ..] | p <- xs])

unionAll ((x:xs):t) = x : union xs (unionAll (pairs t))
  where
    pairs ((x:xs):ys:t) = (x : union xs ys) : pairs t

fix g = xs where xs = g xs    -- the least fixpoint of g

-- ordered merge of two increasing lists, without duplicates
union (x:xs) (y:ys) = case compare x y of
    LT -> x : union xs (y:ys)
    EQ -> x : union xs ys
    GT -> y : union (x:xs) ys

-- odd numbers from k upward that are missing from the multiples stream
gaps k s@(x:xs)
    | k < x     = k : gaps (k+2) s
    | otherwise =     gaps (k+2) xs
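With this primes, the sum from the question becomes (assuming the goal is the primes below two million, matching the takeWhile in the question):

ans = sum $ takeWhile (< 2000000) primes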
Compared with the trial division code in the answer by augustss, it is 1.9x faster at generating 200k primes and 2.1x faster at 400k, with an empirical time complexity of O(n^1.12..1.15) vs. O(n^1.4) on that range. It is 2.6x faster at generating 1 million primes.
Why is the code in the question so slow? Because it opens up a multiples-filtering stream for each prime too early, and so ends up with far too many of them. We don't need to filter by a prime until its square is seen in the input.
Under a stream processing paradigm,
sieve (x:xs) = x : sieve [y | y <- xs, rem y x /= 0]
can be seen as creating a pipeline of stream transducers behind itself as it works:
[2..] ==> sieve --> 2
[3..] ==> nomult 2 ==> sieve --> 3
[4..] ==> nomult 2 ==> nomult 3 ==> sieve
[5..] ==> nomult 2 ==> nomult 3 ==> sieve --> 5
[6..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> sieve
[7..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> sieve --> 7
[8..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> nomult 7 ==> sieve
where nomult p = filter (\y -> rem y p /= 0). But 8 doesn't need to be checked for divisibility by 3 yet, as it is smaller than 3^2 == 9, let alone by 5 or 7.
This is the single most serious problem with that code, although it is dismissed as irrelevant right at the start of the article that everybody mentions. Fixing it by postponing the creation of filters achieves dramatic speedups, as in the sketch below.
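Here is a minimal sketch of that postponement fix, kept deliberately close to the question's code (it still uses trial-division filters, just created late; this is not the union-based sieve above):

primes = 2 : 3 : sieve (tail primes) [5,7..]
  where
    -- start filtering by p only when p*p appears among the candidates
    sieve (p:ps) xs = h ++ sieve ps [x | x <- t, rem x p /= 0]
      where
        (h, t) = span (< p*p) xs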
It is very slow because the algorithm is a trial division that doesn't stop at the square root.
If you look closely at what the algorithm does, you see that for each prime p, those of its multiples that have no smaller prime divisors are removed from the list of candidates (the multiples with smaller prime divisors were removed earlier). So each number is divided by all primes until either it is removed as a multiple of its smallest prime divisor, or it appears at the head of the list of remaining candidates, in which case it is prime.
For the composite numbers, that isn't particularly bad, since most composite numbers have small prime divisors, and in the worst case the smallest prime divisor of n doesn't exceed √n.
But the primes are divided by all smaller primes, so before the k-th prime is recognised as prime, it has been divided by all k-1 smaller primes. If there are m primes below the limit n, the work needed to establish the primality of all of them is

(1-1) + (2-1) + (3-1) + ... + (m-1) = m*(m-1)/2

divisions. By the prime number theorem, the number of primes below n is asymptotically n / log n (where log denotes the natural logarithm). The work to eliminate the composites can crudely be bounded by n * √n divisions, so for not too small n, that is negligible in comparison to the work spent on the primes.
For the primes to 2 million, the Turner sieve needs roughly 10^10 divisions. Additionally, it needs to deconstruct and reconstruct a lot of list cells.
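For concreteness: there are 148933 primes below 2 million, and plugging that into the formula above reproduces the estimate (a quick ghci check, not part of the original answer):

ghci> let m = 148933            -- number of primes below 2,000,000
ghci> m * (m - 1) `div` 2
11090444778                     -- roughly 1.1 * 10^10 divisions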
A trial division that stops at the square root,
isPrime n = go 2
  where
    go d
      | d*d > n        = True    -- no divisor found up to √n: n is prime
      | n `rem` d == 0 = False   -- d divides n: composite
      | otherwise      = go (d+1)

primes = filter isPrime [2 ..]
would need fewer than 1.9*10^9 divisions (a brutal estimate, as if every isPrime n check went all the way to √n; actually it takes only 179492732 divisions, because composites are generally cheap)(1), and far fewer list operations. Additionally, this trial division is easily improved by skipping even numbers (except 2) as candidate divisors, which halves the number of required divisions; see the sketch below.
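A sketch of that even-skipping improvement (an illustrative variant, not code from the answer):

isPrime :: Int -> Bool
isPrime n
    | n < 2     = False
    | n == 2    = True
    | even n    = False
    | otherwise = go 3
  where
    go d
        | d*d > n        = True
        | n `rem` d == 0 = False
        | otherwise      = go (d+2)   -- odd candidate divisors only

primes :: [Int]
primes = 2 : filter isPrime [3,5 ..]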
A sieve of Eratosthenes doesn't need any divisions and uses only O(n * log (log n)) operations, so it is quite a bit faster:
primeSum.hs:

module Main (main) where

import System.Environment (getArgs)
import Math.NumberTheory.Primes    -- from the arithmoi package

main :: IO ()
main = do
    args <- getArgs
    let lim = case args of
                (a:_) -> read a
                _     -> 1000000
    print . sum $ takeWhile (<= lim) primes
And running it for a limit of 10 million:
$ ghc -O2 primeSum && time ./primeSum 10000000
[1 of 1] Compiling Main ( primeSum.hs, primeSum.o )
Linking primeSum ...
3203324994356
real 0m0.085s
user 0m0.084s
sys 0m0.000s
We let the trial division run only to 1 million (fixing the type as Int):
$ ghc -O2 tdprimeSum && time ./tdprimeSum 1000000
[1 of 1] Compiling Main ( tdprimeSum.hs, tdprimeSum.o )
Linking tdprimeSum ...
37550402023
real 0m0.768s
user 0m0.765s
sys 0m0.002s
And the Turner sieve only to 100000:
$ ghc -O2 tuprimeSum && time ./tuprimeSum 100000
[1 of 1] Compiling Main ( tuprimeSum.hs, tuprimeSum.o )
Linking tuprimeSum ...
454396537
real 0m2.712s
user 0m2.703s
sys 0m0.005s
(1) The brutal estimate is

    2000000
       ∑    √k ≈ 4/3 * √2 * 10^9
      k=1

evaluated to two significant digits. Since most numbers are composites with a small prime factor - half the numbers are even and take only one division - that vastly overestimates the number of divisions required.
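That estimate can be checked numerically in ghci (illustrative; assumes Double precision is adequate here):

ghci> sum (map sqrt [1 .. 2000000]) :: Double
-- ≈ 1.8856 * 10^9, in line with 4/3 * √2 * 10^9 ≈ 1.9 * 10^9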
A lower bound for the number of required divisions is obtained by considering the primes only:

       ∑    √p ≈ 2/3 * N^1.5 / log N
     p < N
    p prime

which for N = 2000000 gives roughly 1.3*10^8. That is the right order of magnitude, but it underestimates by a nontrivial factor (decreasing slowly to 1 for growing N, and never greater than 2 for N > 10).
Besides the primes, the squares of primes and the products of two close primes also require the trial division to go up to (nearly) √k, and hence contribute significantly to the overall work if there are sufficiently many of them. The number of divisions needed to treat the semiprimes is, however, bounded by a constant multiple of

N^1.5 / (log N)^2

so for very large N it becomes negligible relative to the cost of treating the primes. But in the range where trial division is feasible at all, they still contribute significantly.
What you did is not the Sieve of Eratosthenes; it's trial division (note the mod operator). Here's my version of the Sieve of Eratosthenes:
import Control.Monad (forM_, when)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed

-- Bit i of the array stands for the odd number 2*i+3.
sieve :: Int -> UArray Int Bool
sieve n = runSTUArray $ do
    let m = (n - 1) `div` 2
        r = floor . sqrt $ fromIntegral n
    bits <- newArray (0, m - 1) True
    forM_ [0 .. r `div` 2 - 1] $ \i -> do
        isPrime <- readArray bits i
        when isPrime $ do
            -- cross off the odd multiples of p = 2*i+3, starting at p*p
            -- (p*p sits at index 2*i*i+6*i+3; the stride between odd
            -- multiples is p, i.e. 2*i+3 in index space)
            forM_ [2*i*i+6*i+3, 2*i*i+8*i+6 .. m-1] $ \j -> do
                writeArray bits j False
    return bits

primes :: Int -> [Int]
primes n = 2 : [2*i+3 | (i, True) <- assocs $ sieve n]
You can run it at http://ideone.com/mu1RN.
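With this, the sum from the question would be (assuming the intent is the primes below two million, as the takeWhile in the question suggests):

main :: IO ()
main = print . sum $ primes 2000000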