I'm using GHC 7.4 to compile the following function:
nodups' :: [Int] -> Bool
nodups' = ok empty
where ok _ [] = True
ok seen (n:ns) = not (n `member` seen) && ok (n `insert` seen) ns
member n word = testBit word n
insert n word = setBit word n
empty = 0 :: Int
The function looks for duplicate elements in a list of small integers. The set seen
is a representation of a set of small integers as a bit vector. The profiler (run with ghc -prof -auto-all
) claims that the ok
function accounts for 22% of allocation overall. Looking at the output with -ddump-simpl
, I can't understand why this code is allocating. I checked, and as far as I can tell it is not allocating a thunk for the call to insert
.
What should I look at to identify the part of my code that is allocating?
I know of simple (scientific) implementations of functional languages, and if I remember correctly there is the G-Machine that may be used with Haskell.
This means (again, if I remember correctly) that your program state is represented like a "Tree", where the nodes are (for the sake of simplicity here) the functions you use in your code. The leafes would be the arguments to it. The "G-Maschine" then looks along the "Spine" (the left-side chain of nodes) and looks in the set of available "Functions" ("Supercombinators"?) for a pattern-match that it can apply. If a mattern-match is recognized from the left side of a definition it is then replaced by the right side of the definition.
This means that even a simple line like
ok seen (n:ns) = not (n `member` seen) && ok (n `insert` seen) ns
or even
(n:ns) = ns
is doing something in computer memory, i.e. matching the pattern
...
...
(:)
/ \
n ns
and replacing it with
...
...
ns
The final result might consume less memory then the input, but this is a dynamic step and therefore must take place somewhere. If this is repeated over and over again (in a "tight loop") then this will make you CPU busy, as well it will your memory -- just because the G-Machine is operating. (As I said, I am not sure the G-Machine-concept applies here, but I guess it is something similar).
member n word = testBit word n
insert n word = setBit word n
Besides that I habe some suspicions. testBit
and setBit
look like index operations on lists. If they are it could take some work. If they are proper arrays it would be ok. If they are a sort of maps or sets... well... there might be costly hashing involved? Or implemented via a balanced tree, which uses lots of (costly?) comparision operations?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With