So I was working on a way to lazily generate primes, and I came up with these three definitions, which all work in an equivalent way, just checking whether each new integer has a factor among all the preceding primes:
primes1 :: [Integer]
primes1 = mkPrimes id [2..]
  where mkPrimes f (x:xs) =
          if f (const True) x
          then
            let g h y = y `mod` x > 0 && h y in
            x : mkPrimes (f . g) xs
          else
            mkPrimes f xs
primes2 :: [Integer]
primes2 = mkPrimes id (const True) [2..]
  where mkPrimes f f_ (x:xs) =
          if f_ x
          then
            let g h y = y `mod` x > 0 && h y in
            x : mkPrimes (f . g) (f $ g $ const True) xs
          else
            mkPrimes f f_ xs
primes3 :: [Integer]
primes3 = mkPrimes [] [2..]
  where mkPrimes ps (x:xs) =
          if all (\p -> x `mod` p > 0) ps
          then
            x : mkPrimes (ps ++ [x]) xs
          else
            mkPrimes ps xs
So it seems to me that primes2 should be a little faster than primes1, since it avoids recomputing f_ = f (const True) for every integer (which I think requires work on the order of the number of primes we've found thus far), and only updates it when we encounter a new prime.
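To make that cost concrete (an illustration with hypothetical names, not part of the definitions above): after primes1 has found 2, 3, and 5, the accumulated f is a three-layer composition, so each call to f (const True) x walks one closure per prime found so far.

-- Illustrative only: the shape of f inside primes1 after the primes 2, 3, 5.
g :: Integer -> (Integer -> Bool) -> Integer -> Bool
g p h y = y `mod` p > 0 && h y

f_after3 :: (Integer -> Bool) -> Integer -> Bool
f_after3 = g 2 . g 3 . g 5

-- f_after3 (const True) x walks all three closures before answering.

primes2 pays this chain-walking cost only once per new prime, when it rebuilds f_.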
Just from unscientific tests (running take 1000 in ghci), it seems like primes3 runs faster than primes2.
Should I take a lesson from this and assume that, if I can represent a function as an operation on an array, I should implement it in the latter manner for efficiency? Or is there something else going on here?
What's the second argument to the f's needed for? In my opinion, both of these alternatives are more readable, and don't significantly impact performance...
...
let g y = f y && y `mod` x > 0 in
x : mkPrimes g xs
...
import Control.Monad.Instances -- instance Monad ((->) r); in the Prelude on modern GHC
import Control.Monad -- liftM2

-- the type signature also sidesteps the monomorphism restriction
(.&&.) :: (a -> Bool) -> (a -> Bool) -> a -> Bool
(.&&.) = liftM2 (&&)
...
let g y = y `mod` x > 0 in
x : mkPrimes (f .&&. g) xs
...
Anyhow, back to the question. Sometimes using functions as data structures is the best representation for a certain task, and sometimes not. "Best" in terms of ease-of-coding and "best" in terms of performance are not always the same thing. The "functions as data structures" technique is essential to runtime compilation, but as that page warns,
Runtime compilation can sometimes win you significant efficiency gains, but can often win you almost nothing at the cost of your increased stress and reduced productivity.
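As a minimal sketch of the technique itself (illustrative names, not from the linked page): a set of Integers can be represented by its characteristic function, which makes insertion trivial but makes every membership test walk the whole chain of closures, much like the f's above.

-- A set represented as its characteristic function.
type IntSet = Integer -> Bool

emptySet :: IntSet
emptySet = const False

insert :: Integer -> IntSet -> IntSet
insert x s = \y -> y == x || s y   -- each insert adds one closure layer

member :: Integer -> IntSet -> Bool
member x s = s x                   -- membership walks the closure chain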
In your case, it's likely that the overhead of constructing each f :: Integer -> ... -> Bool is significantly higher than the overhead of constructing each ps :: [Integer], with little or no difference when calling f ... x versus all ... ps.
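One way to test that claim (a hypothetical micro-benchmark, not from the original post; it assumes the primes1 and primes3 definitions above and the time package; criterion would give more reliable numbers):

import Data.Time.Clock (diffUTCTime, getCurrentTime)

-- Force the first n primes of the given list and report wall-clock time.
timePrimes :: String -> Int -> [Integer] -> IO ()
timePrimes label n ps = do
    t0 <- getCurrentTime
    print (sum (take n ps))  -- summing forces every element
    t1 <- getCurrentTime
    putStrLn (label ++ ": " ++ show (diffUTCTime t1 t0))

main :: IO ()
main = do
    timePrimes "primes1" 1000 primes1
    timePrimes "primes3" 1000 primes3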
To squeeze cycles out of the infinite prime sieve, get rid of the calls to mod! Integer multiplication, division, and modulus are much slower than integer addition and subtraction. On my machine, this implementation clocks in at 40% faster when calculating the first 1000 primes (GHC 6.10.3, -O2).
import Data.List (foldl')
import qualified Data.Map as M

primes' :: [Integer]
primes' = mkPrimes 2 M.empty
  where
    -- the map sends each upcoming composite to the primes whose next
    -- multiple it is
    mkPrimes n m = case (M.null m, M.findMin m) of
        (False, (n', skips)) | n == n' ->
            -- n is composite: reschedule each prime p in skips at n + p
            mkPrimes (succ n) (addSkips n (M.deleteMin m) skips)
        _ ->
            -- n is prime: emit it and schedule its first skip at n + n
            n : mkPrimes (succ n) (addSkip n m n)
    addSkip n m s = M.alter (Just . maybe [s] (s:)) (n+s) m
    addSkips = foldl' . addSkip
In action (using a bit of JSON-ish syntax),
mkPrimes 2 {}
=> 2 : mkPrimes 3 {4: [2]}
=> 2 : 3 : mkPrimes 4 {4: [2], 6: [3]}
=> 2 : 3 : mkPrimes 5 {6: [2, 3]}
=> 2 : 3 : 5 : mkPrimes 6 {6: [2, 3], 10: [5]}
=> 2 : 3 : 5 : mkPrimes 7 {8: [2], 9: [3], 10: [5]}
=> 2 : 3 : 5 : 7 : mkPrimes 8 {8: [2], 9: [3], 10: [5], 14: [7]}
=> 2 : 3 : 5 : 7 : mkPrimes 9 {9: [3], 10: [2, 5], 14: [7]}
=> 2 : 3 : 5 : 7 : mkPrimes 10 {10: [2, 5], 12: [3], 14: [7]}
=> 2 : 3 : 5 : 7 : mkPrimes 11 {12: [2, 3], 14: [7], 15: [5]}
...
The map keeps track of future multiples, using nothing but addition.
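For example, loading the module above in GHCi:

ghci> take 10 primes'
[2,3,5,7,11,13,17,19,23,29]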