I noticed that sometimes Haskell pure functions are somehow cached: if I call the function twice with the same parameters, the second time the result is computed in no time.
As required by comments, here is an example I found on the web:
isPrime a = isPrimeHelper a primes isPrimeHelper a (p:ps) | p*p > a = True | a `mod` p == 0 = False | otherwise = isPrimeHelper a ps primes = 2 : filter isPrime [3,5..]
I was expecting, before running it, to be quite slow, since it keeps accessing elements of primes
without explicitly caching them (thus, unless these values are cached somewhere, they would need to be recomputed plenty times). But I was wrong.
If I set +s
in GHCI (to print timing/memory stats after each evaluation) and evaluate the expression primes!!10000
twice, this is what I get:
*Main> :set +s *Main> primes!!10000 104743 (2.10 secs, 169800904 bytes) *Main> primes!!10000 104743 (0.00 secs, 0 bytes)
This means that at least primes !! 10000
(or better: the whole primes
list, since also primes!!9999
will take no time) must be cached.
primes
, in your code, is not a function, but a constant, in haskellspeak known as a CAF. If it took a parameter (say, ()
), you would get two different versions of the same list back if calling it twice, but as it is a CAF, you get the exact same list back both times;
As a ghci top-level definition, primes
never becomes unreachable, thus the head of the list it points to (and thus its tail/the rest of the computation) is never garbage collected. Adding a parameter prevents retaining that reference, the list would then be garbage collected as (!!)
iterates down it to find the right element, and your second call to (!!)
would force repetition of the whole computation instead of just traversing the already-computed list.
Note that in compiled programs, there is no top-level scope like in ghci and things get garbage collected when the last reference to them is gone, quite likely before the whole program exits, CAF or not, meaning that your first call would take long, the second one not, and after that, "the future of your program" not referencing the CAF anymore, the memory the CAF takes up is recycled.
The primes package provides a function that takes an argument for (primarily, I'd claim) this very reason, as carrying around half a terabyte of prime numbers might not be what one wants to do.
If you want to really get down to the bottom of this, I recommend reading the STG paper. It doesn't include newer developments in GHC, but does a great job of explaining how Haskell maps onto assembly, and thus how thunks get eaten by strictness, in general.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With