I would like to implement term graphs in Haskell, so that I can implement a term rewriting engine that uses sharing. Something like
data TG f v = Var v | Op f [TG f v] | P (Ptr (TG f v))
And I would want something like the following to make sense:
let
t' = Op 'f' [Var 'x', Var 'y']
t = getPointer t'
in
Op 'g' [P t,P t]
Then during rewriting, I only have to rewrite t once.
However, I noticed two things: (1) the module is called Foreign.Storable, so should it only be used for FFI stuff and (2) there are no instances of Foreign.Storable for any types like lists; why is this?
As pointed out in the comments, if you want to define a normal algebraic datatype in Haskell but gain access to the graph structure, you need to use some variant of observable sharing. Types like ForeignPtr
are really for interfacing with external code or low-level memory management and aren't really appropriate for this kind of situation.
All the available techniques for observable sharing require some kind of slightly "unsafe" code - in that the burden is on the user not to misuse it. The issue is that Haskell's semantics aren't intended to allow you to "see" whether two values are the same pointer or not. However in practice the worst that can happen is that you will miss some situation where the user used a single definition, so you will end up with duplication in your internal data structure. Depending on the semantics of your own structure, this may just have a performance impact.
Observable sharing is usually based on lower level primitives for pointer equality - i.e. checking whether two specified Haskell values are actually being stored at exactly the same location in memory, or the more versatile stable names, which represent the location in memory of a single Haskell value and can be stored in a table and compared for equality later on.
Higher level libraries like data-reify
help to hide these details from you.
The nicest way to use observable sharing is to allow users to write normal values of the algebraic types, e.g. for your example simply:
let t = Op 'f' [Var 'x', Var 'y']
in Op 'g' [P t,P t]
and then have your library use whichever approach to observable sharing to translate that into some kind of explicit graph structure as soon as you receive the values from the user. For example you might translate into a different datatype with explicit pointers, or augment the TG
type with them. The explicit pointers would just be some kind of lookup into your own map structure, e.g.
data InternalTG f v = ... | Pointer Int
type TGMap f v = IntMap (InternalTG f v)
If using something like data-reify
then InternalTG f v
would be the DeRef
type for TG f v
.
You can then do your rewriting on the resulting graph structure.
As an alternative to using observable sharing at all, if you are willing for your users to use a monad to construct their values and explicitly choose when to use sharing (as suggested by the inclusion of getPointer
above), then you can simply use a state monad to build up the graph explicitly:
-- your code
data TGState f v = TGState { tgMap :: IntMap (TG f v), tgNextSymbol :: Int }
initialTGState :: TGState f v
initialTGState = TGState { tgMap = IntMap.empty, tgNextSymbol = 0 }
type TGMonad f v a = State (TGState f v) a
newtype Ptr tg = Ptr Int -- a "phantom type" just to give some type safety
getPointer :: TG f v -> TGMonad f v (Ptr (TG f v))
getPointer tg = do
tgState <- get
let sym = tgNextSymbol tgState
put $
TGState { tgMap = IntMap.insert sym tg (tgMap tgState),
tgNextSymbol = sym + 1 }
return (Ptr sym)
runTGMonad :: TGMonad a -> (a, IntMap (TG f v))
runTGMonad m =
let (v, tgState) = runState m
(v, tgMap tgState)
-- user code
do
let t' = Op 'f' [Var 'x', Var 'y']
t <- getPointer t'
return $ Op 'g' [P t,P t]
Once you have the graph by whatever route, there are all sorts of techniques for manipulating it, but these are probably beyond the scope of your original question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With