Hello, I've encountered a wired behaviour of the optimisation flags of ghc. The optimising flags seem to change the way of evaluation. In summary,
primes
and isPrime
defined by referring to the each other.ghc -O3
, but I could not use runhaskell
to get the result. It costs too much time.ghc -O1
, the result appears instantly as -O3
, but the executable compiled by ghc -O0
fails to calculate the result in a minute.Debug.Trace.trace
to find that primes
is evaluated from its start every time when isPrime
is called.primes
and isPrime
to another file Prime.hs
. In the main file, I imported my Prime library. Unfortunately, the executable compiled by ghc -O3
does not calculate the result in a minute.Here's the description. Please see the following code.
main :: IO ()
main = print $ length $ filter isPrime [100000..1000000]
primes :: Integral a => [a]
primes = 2 : filter isPrime [3,5..]
isPrime :: Integral a => a -> Bool
isPrime n = n > 1 && foldr (\p r -> p * p > n || (n `mod` p /= 0 && r)) True primes
When I compile the code with ghc -O3
, the executable calculates the correct result 68906
in 2 seconds.
$ ghc -O3 test.hs
[1 of 1] Compiling Main ( test.hs, test.o )
Linking test ...
$ time ./test
68906
./test 1.24s user 0.02s system 79% cpu 1.574 total
However, when I used -O0
, I could not get the result in a minute. Be sure to remove the generated files in advance.
$ rm -f ./test ./test.o ./test.hi
$ ghc -O0 test.hs
[1 of 1] Compiling Main ( test.hs, test.o )
Linking test ...
$ time ./test
^C
./test 64.34s user 0.94s system 94% cpu 1:08.90 total
I aborted. The flag -O1
works well as same as -O3
.
So let us dive into investigation. I used Debug.Trace.trace
. I traced the argument of isPrime
.
import Debug.Trace
main :: IO ()
main = print $ length $ filter isPrime [10..30]
primes :: (Show a, Integral a) => [a]
primes = 2 : filter isPrime [3,5..]
isPrime :: (Show a, Integral a) => a -> Bool
isPrime n = trace (show n) $ n > 1 && foldr (\p r -> p * p > n || (n `mod` p /= 0 && r)) True primes
When the optimisation flag is -O3
, (or -O1
), the output is as follows.
10
11
3
5
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
7
30
6
This result is reasonable (note that the last line prints the number of the primes; 11, 13, 17, 19, 23, 29).
Here's the result with -O0
(or runhaskell
)
10
11
3
5
3
12
13
3
5
3
14
15
3
16
17
3
5
3
18
19
3
5
3
20
21
3
22
23
3
5
3
24
25
3
5
3
26
27
3
28
29
3
5
3
7
3
30
6
This result is interesting to look into. 2 is already arranged at the head of primes
. 3 and 5 are checked if isPrime
again and again. When isPrime 11
is called, 3 is checked if a prime, and 5 is also checked, isPrime 3
is called again. Likewise, for almost every odd numbers, isPrime 3
and isPrime 5
is called again and again.
Thus I thought that when I use -O0
, primes
is not cached and constructed from [2]
every time as isPrime
is called. So the first question is why -O0
and -O1
changes the behavior of evaluation.
Here's another problem. Okay, okay, but I rarely use -O0
flag. In most case I use -O2
or -O3
optimisation flag so I thought that the above problem does not appear in many use case.
But when I moved the codes into another file, the problem again turns up. I just moved primes
and isPrime
to Prime.hs.
test.hs:
import Prime
main :: IO ()
main = print $ length $ filter isPrime [100000..1000000]
Prime.hs:
module Prime where
primes :: Integral a => [a]
primes = 2 : filter isPrime [3,5..]
isPrime :: Integral a => a -> Bool
isPrime n = n > 1 && foldr (\p r -> p * p > n || (n `mod` p /= 0 && r)) True primes
In this time, I could not obtain the result with -O1
flag, or even with -O3
flag.
$ ghc -O3 test.hs
[1 of 2] Compiling Prime ( Prime.hs, Prime.o )
[2 of 2] Compiling Main ( test.hs, test.o )
Linking test ...
$ time ./test
^C
./test 62.41s user 0.88s system 92% cpu 1:08.23 total
hmm, I aborted again. I do not know whether this way has an effect to the result, I precompiled Prime.hs with -O3
in advance, but in vain. I hereby used Debug.Trace.trace
and I saw 2 and 3 again and again with -O3
flag. In short, I could not create a Prime library because the evaluation way changes when primes
and isPrime
are moved into a module (which made me surprised) and -O3
does not make it work.
So the second question is, in spite of the -O3
flag, why the stuffs in a module are evaluated as compiled by -O0
flag?
I finally get tired of investigating into this wired behaviour. I concluded that I should not use a cross-referenced definition in a module. I gave up creating my Prime library and started to use Data.Numbers.Primes
.
Thanks in advance.
What's going on here is in the following signature:
primes :: Integral a => [a]
The type class prevents primes
from being memoized naively. primes :: [Int]
is not the same as primes :: [Integer]
. And no calculations can be shared, because GHC can't assume all instances of Num
follow the same logic. Because of that, every use of primes
ends up recalculating the list at the selected type.
But when you enable optimizations, GHC gets a fair bit smarter. When the only use of primes
is in the same module as the definition, GHC can optimize it down to the concrete type it's used as. Then calculations are shared across uses of the list.
It only does this within module boundaries, though. Separate compilation of modules means that if primes
is exported, it can't be specialized to a concrete type - GHC never knows if the next module it will compile might use primes
at a different type.
The simplest way to resolve this is to give primes
a concrete type. Then even the naive use of it memoizes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With