Why ghc changes the evaluation way due to the optimisation flag?

Question

Hello, I've encountered a wired behaviour of the optimisation flags of ghc. The optimising flags seem to change the way of evaluation. In summary,

I wrote a code containing primes and isPrime defined by referring to the each other.
I found that the program works well with ghc -O3, but I could not use runhaskell to get the result. It costs too much time.
I noticed that when I used ghc -O1, the result appears instantly as -O3, but the executable compiled by ghc -O0 fails to calculate the result in a minute.
I used Debug.Trace.trace to find that primes is evaluated from its start every time when isPrime is called.
I moved the definition of primes and isPrime to another file Prime.hs. In the main file, I imported my Prime library. Unfortunately, the executable compiled by ghc -O3 does not calculate the result in a minute.

Here's the description. Please see the following code.

main :: IO ()
main = print $ length $ filter isPrime [100000..1000000]

primes :: Integral a => [a]
primes = 2 : filter isPrime [3,5..]

isPrime :: Integral a => a -> Bool
isPrime n = n > 1 && foldr (\p r -> p * p > n || (n `mod` p /= 0 && r)) True primes

When I compile the code with ghc -O3, the executable calculates the correct result 68906 in 2 seconds.

 $ ghc -O3 test.hs
[1 of 1] Compiling Main             ( test.hs, test.o )
Linking test ...
 $ time ./test
68906
./test  1.24s user 0.02s system 79% cpu 1.574 total

However, when I used -O0, I could not get the result in a minute. Be sure to remove the generated files in advance.

 $ rm -f ./test ./test.o ./test.hi
 $ ghc -O0 test.hs
[1 of 1] Compiling Main             ( test.hs, test.o )
Linking test ...
 $ time ./test
^C
./test  64.34s user 0.94s system 94% cpu 1:08.90 total

I aborted. The flag -O1 works well as same as -O3.

So let us dive into investigation. I used Debug.Trace.trace. I traced the argument of isPrime.

import Debug.Trace

main :: IO ()
main = print $ length $ filter isPrime [10..30]

primes :: (Show a, Integral a) => [a]
primes = 2 : filter isPrime [3,5..]

isPrime :: (Show a, Integral a) => a -> Bool
isPrime n = trace (show n) $ n > 1 && foldr (\p r -> p * p > n || (n `mod` p /= 0 && r)) True primes

When the optimisation flag is -O3, (or -O1), the output is as follows.

This result is reasonable (note that the last line prints the number of the primes; 11, 13, 17, 19, 23, 29).

Here's the result with -O0 (or runhaskell)

This result is interesting to look into. 2 is already arranged at the head of primes. 3 and 5 are checked if isPrime again and again. When isPrime 11 is called, 3 is checked if a prime, and 5 is also checked, isPrime 3 is called again. Likewise, for almost every odd numbers, isPrime 3 and isPrime 5 is called again and again.

Thus I thought that when I use -O0, primes is not cached and constructed from [2] every time as isPrime is called. So the first question is why -O0 and -O1 changes the behavior of evaluation.

Here's another problem. Okay, okay, but I rarely use -O0 flag. In most case I use -O2 or -O3 optimisation flag so I thought that the above problem does not appear in many use case.

But when I moved the codes into another file, the problem again turns up. I just moved primes and isPrime to Prime.hs.

test.hs:

import Prime

main :: IO ()
main = print $ length $ filter isPrime [100000..1000000]

Prime.hs:

module Prime where

primes :: Integral a => [a]
primes = 2 : filter isPrime [3,5..]

isPrime :: Integral a => a -> Bool
isPrime n = n > 1 && foldr (\p r -> p * p > n || (n `mod` p /= 0 && r)) True primes

In this time, I could not obtain the result with -O1 flag, or even with -O3 flag.

 $ ghc -O3 test.hs
[1 of 2] Compiling Prime            ( Prime.hs, Prime.o )
[2 of 2] Compiling Main             ( test.hs, test.o )
Linking test ...
 $ time ./test
^C
./test  62.41s user 0.88s system 92% cpu 1:08.23 total

hmm, I aborted again. I do not know whether this way has an effect to the result, I precompiled Prime.hs with -O3 in advance, but in vain. I hereby used Debug.Trace.trace and I saw 2 and 3 again and again with -O3 flag. In short, I could not create a Prime library because the evaluation way changes when primes and isPrime are moved into a module (which made me surprised) and -O3 does not make it work.

So the second question is, in spite of the -O3 flag, why the stuffs in a module are evaluated as compiled by -O0 flag?

I finally get tired of investigating into this wired behaviour. I concluded that I should not use a cross-referenced definition in a module. I gave up creating my Prime library and started to use Data.Numbers.Primes.

Thanks in advance.

Carl · Accepted Answer

What's going on here is in the following signature:

primes :: Integral a => [a]

The type class prevents primes from being memoized naively. primes :: [Int] is not the same as primes :: [Integer]. And no calculations can be shared, because GHC can't assume all instances of Num follow the same logic. Because of that, every use of primes ends up recalculating the list at the selected type.

But when you enable optimizations, GHC gets a fair bit smarter. When the only use of primes is in the same module as the definition, GHC can optimize it down to the concrete type it's used as. Then calculations are shared across uses of the list.

It only does this within module boundaries, though. Separate compilation of modules means that if primes is exported, it can't be specialized to a concrete type - GHC never knows if the next module it will compile might use primes at a different type.

The simplest way to resolve this is to give primes a concrete type. Then even the naive use of it memoizes.

Why ghc changes the evaluation way due to the optimisation flag?

Tags:

haskell

ghc

itchyny

1 Answers

Carl

Recent Activity

Donate For Us

Why ghc changes the evaluation way due to the optimisation flag?

Tags:

haskell

ghc

itchyny

1 Answers

Carl

Related questions

Recent Activity

Donate For Us