Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is evaluating keys to WHNFs enough to construct Sets in Haskell?

Haskell's Data.Set page says:

Key arguments are evaluated to WHNF

in its "Strictness properties" section.

However, I wonder why WHNF is enough to construct Sets. For example, to construct an instance of Set [Int], we must evaluate its elements of Lists of Ints deeply to compare them.

like image 972
raviqqe Avatar asked Jan 19 '17 14:01

raviqqe


2 Answers

To expand on @chi’s comment: This only means that keys are at least evaluated to WHNF. Often, they might be evaluated always once they are compared, but not always. Let’s invoke ghc-heap-view and look at a few examples:

Prelimaries:

~ $ ghci
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Prelude> :script /home/jojo/.cabal/share/x86_64-linux-ghc-8.0.1/ghc-heap-view-0.5.7/ghci
Prelude> import qualified Data.Set as S

singleton does not fully evaluate its argument (the _bco is a thunk):

Prelude S> let t = [True, True && True] -- a thunk in a list
Prelude S> let s1 = S.singleton t
Prelude S> s1 `seq` ()
()
Prelude S> :printHeap s1
let x1 = True()
    x2 = Tip()
in Bin [x1,(_bco _fun)()] x2 x2 1

And even inserting another element will not fully evaluate the second elements in the list, as they can be distinguished already by looking at the first element:

Prelude S> let t2 = [False, False && False] -- a thunk in another  list
Prelude S> let s2 = S.insert t2 s1
Prelude S> s2 `seq` ()
()
Prelude S> :printHeap s2
let x1 = True()
    x2 = toArray (0 words) 
    f1 = _fun
    x3 = []
    x4 = False()
    x5 = Tip()
in Bin (x1 : (_bco f1)() : x3) (Bin (x4 : (_bco f1)() : x3) x5 x5 1) x5 2

But inserting t2 again will now force the second element of that list:

Prelude S> let s3 = S.insert t2 s2
Prelude S> s3 `seq` ()
()
Prelude S> :printHeap s3
let x1 = True()
    x2 = []
    x3 = False()
    x4 = Tip()
in Bin (x1 : (_bco _fun)() : x2) (Bin (x3 : _bh x3 : x2) x4 x4 1) x4 2

So you cannot rely on Data.Set to evaluate the keys fully as you store them. If you want that, you need to use, for example, (singleton $!! t1) and (insert $!! t2).

(If someone wants to replace the ghc-heap-view output in this answer with ghc-vis graphs, feel free to do so :-)).

like image 195
Joachim Breitner Avatar answered Sep 28 '22 02:09

Joachim Breitner


There may be data types usable as keys that don't need to evaluate everything they contain to determine equality or order. Lists are no such types, of course.

However, while lists must be fully evaluated to find equality, they don't necessary need to in order to find non-equality. That is, we wouldn't expect that

[1..] == [2..]

computes forever. Likewise with

[] == [(40+2)..]

Here it is enough to get WHNF of the second list to find that it is not equal to the first one. We need not bother to compute 40+2, much less the succeeding elements.

like image 28
Ingo Avatar answered Sep 28 '22 02:09

Ingo