Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How efficient is the derived Eq instance in GHC?

Is there a short circuit built in to GHC's (and Haskell's in general) derived Eq instance that will fire when I compare the same instance of a data type?

-- will this fire?
let same = complex == complex

My plan is to read in a lazy datastructure (let's say a tree), change some values and then compare the old and the new version to create a diff that will then be written back to the file.

If there would be a short circuit built in then the compare step would break as soon as it finds that the new structure is referencing old values. At the same time this wouldn't read in more than necessary from the file in the first place.

I know I'm not supposed to worry about references in Haskell but this seems to be a nice way to handle lazy file changes. If there is no shortcircuit builtin, would there be a way to implement this? Suggestions on different schemes welcome.

like image 524
fho Avatar asked Oct 09 '12 10:10

fho


2 Answers

StableNames are specifically designed to solve problems like yours.

Note that StableNames can only be created in the IO monad. So you have two choices: either create your objects in the IO monad, or use unsafePerformIO in your (==) implementation (which is more or less fine in this situation).

But I should stress that it is possible to do this in a totally safe way (without unsafe* functions): only creation of stable names should happen in IO; after that, you may compare them in a totally pure way.

E.g.

data SNWrapper a = SNW !a !(StableName a)

snwrap :: a -> IO (SNWrapper a)
snwrap a = SNW a <$> makeStableName a

instance Eq a => Eq (SNWrapper a) where
  (SNW a sna) (SNW b snb) = sna == snb || a == b

Notice that if stable name comparison says "no", you still need to perform full value comparison to get a definitive answer.

In my experience that worked pretty well when you have lots of sharing and for some reason are not willing to use other methods to indicate sharing.

(Speaking of other methods, you could, for example, replace the IO monad with a State Integer monad and generate unique integers in that monad as an equivalent of "stable names".)

Another trick is, if you have a recursive data structure, make the recursion go through SNWrapper. E.g. instead of

data Tree a = Bin (Tree a) (Tree a) | Leaf a
type WrappedTree a = SNWrapper (Tree a)

use

data Tree a = Bin (WrappedTree a) (WrappedTree a) | Leaf a
type WrappedTree a = SNWrapper (Tree a)

This way, even if short-circuiting doesn't fire at the topmost layer, it might fire somewhere in the middle and still save you some work.

like image 179
Roman Cheplyaka Avatar answered Nov 18 '22 07:11

Roman Cheplyaka


There's no short-circuiting when both arguments of (==) are the same object. The derived Eq instance will do a structural comparison, and in the case of equality, of course needs to traverse the entire structure. You can build in a possible shortcut yourself using

GHC.Prim.reallyUnsafePtrEquality# :: a -> a -> GHC.Prim.Int#

but that will in fact fire only rarely:

Prelude GHC.Base> let x = "foo"
Prelude GHC.Base> I# (reallyUnsafePtrEquality# x x)
1
Prelude GHC.Base> I# (reallyUnsafePtrEquality# True True)
1
Prelude GHC.Base> I# (reallyUnsafePtrEquality# 3 3)
0
Prelude GHC.Base> I# (reallyUnsafePtrEquality# (3 :: Int) 3)
0

And if you read a structure from file, it will certainly not find it the same object as one that was already in memory.

You can use rewrite rules to avoid the comparison of lexically identical objects

module Equal where

{-# RULES
"==/same"  forall x. x == x = True
  #-}

main :: IO ()
main = let x = [1 :: Int .. 10] in print (x == x)

which leads to

$ ghc -O -ddump-rule-firings Equal.hs 
[1 of 1] Compiling Equal            ( Equal.hs, Equal.o )
Rule fired: Class op enumFromTo
Rule fired: ==/same
Rule fired: Class op show

the rule firing (note: it didn't fire with let x = "foo", but with user-defined types, it should).

like image 4
Daniel Fischer Avatar answered Nov 18 '22 07:11

Daniel Fischer