Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I efficiently process a recursive data structure with a high degree of sharing? [duplicate]

In Scheme, the primitive eq? tests whether its arguments are the same object. For example, in the following list

(define lst
  (let (x (list 'a 'b))
    (cons x x)))

The result of

(eq? (car x) (cdr x))

is true, and moreover it is true without having to peer into (car x) and (cdr x). This allows you to write efficient equality tests for data structures that have a lot of sharing.

Is the same thing ever possible in Haskell? For example, consider the following binary tree implementation

data Tree a = Tip | Bin a (Tree a) (Tree a)

left  (Bin _ l _) = l
right (Bin _ _ r) = r

mkTree n :: Int -> Tree Int
mkTree 0 = Tip
mkTree n = let t = mkTree (n-1) in Bin n t t

which has sharing at every level. If I create a tree with let tree = mkTree 30 and I want to see if left tree and right tree are equal, naively I have to traverse over a billion nodes to discover that they are the same tree, which should be obvious because of data sharing.

I don't expect there is a simple way to discover data sharing in Haskell, but I wondered what the typical approaches to dealing with issues like this are, when it would be good to detect sharing for efficiency purposes (or e.g. to detect cyclic data structures).

Are there unsafe primitives that can detect sharing? Is there a well-known way to build data structures with explicit pointers, so that you can compare pointer equality?

like image 562
Chris Taylor Avatar asked Nov 22 '22 11:11

Chris Taylor


2 Answers

There's lots of approaches.

  1. Generate unique IDs and stick everything in a finite map (e.g. IntMap).
  2. The refined version of the last choice is to make an explicit graph, e.g. using fgl.
  3. Use stable names.
  4. Use IORefs (see also), which have both Eq and Ord instances regardless of the contained type.
  5. There are libraries for observable sharing.
  6. As mentioned above, there is reallyUnsafePtrEquality# but you should understand what's really unsafe about it before you use it!

See also this answer about avoiding equality checks altogether.

like image 80
Daniel Wagner Avatar answered Feb 09 '23 00:02

Daniel Wagner


It is not possible in Haskell, the pure language.

But in its implementation in GHC, there are loopholes, such as

  • the use of reallyUnsafePtrEquality# or
  • introspection libraries like ghc-heap-view.

In any case, using this in regular code would be very unidiomatic; at most I could imagine that building a highly specialized library for something (memoizatoin, hash tables, whatever) that then provides a sane, pure API, might be acceptable.

like image 24
Joachim Breitner Avatar answered Feb 08 '23 23:02

Joachim Breitner