I'm trying to recover sharing (in the Type-Safe Observable Sharing in Haskell sense) for a simple AST, using Data.Reify
:
{-# LANGUAGE DeriveFoldable, DeriveFunctor, DeriveTraversable, TypeFamilies #-}
module Sharing where
import Data.Foldable
import Data.Reify
import Data.Traversable
-- Original AST, without sharing. Expressed as a functor for ease of
-- use with Data.Reify.
data AstF f =
LitF Int
| AddF f f
deriving (Foldable, Functor, Show, Traversable)
newtype Fix f = In { out :: f (Fix f) }
instance Traversable a => MuRef (Fix a) where
type DeRef (Fix a) = a
mapDeRef f = traverse f . out
type Ast' = Fix AstF
-- Final AST, with explicit sharing.
data Ast =
Var Name
| Let Ast Ast
| Lit Int
| Add Ast Ast
deriving Show
type Name = Int -- de Bruijn index
-- Recover sharing and introduce Lets/Vars.
recoverSharing :: Ast' -> IO Ast
recoverSharing e = introduceLets `fmap` reifyGraph e
where
introduceLets :: Graph (DeRef Ast') -> Ast
introduceLets = undefined -- ???
I have the feeling that implementing introduceLets
(which should introduce both Let
s and Var
s) ought to simple and short, but I don't have enough experience with de Bruijn indices to know if there's a standard way to do it. How would you convert the Graph
representation into the Ast
representation?
P.S. Note that this is a quite degenerate case, as Ast'
doesn't actually have a binding constructor of its own; all bindings come from the sharing recovery.
P.P.S. Ideally we wouldn't introduce Let
s for single use expressions (although if we do we can remove them using an inlining pass.)
We'll divide this problem into 3 parts. The first part is to use the data-reify library to recover the graph of the AstF
. The second part will create an abstract syntax tree with Let
bindings represented with de Bruijn indices. Finally, we will remove all of the unnecessary let bindings.
These are all the toys we will use along the way. StandaloneDeriving
and UndecidableInstances
are only needed to provide Eq
and Show
instances for things like Fix
.
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
import Data.Foldable
import Data.Reify
import Data.Traversable
import qualified Data.List as List
import Data.IntMap ((!))
import qualified Data.IntMap as IntMap
import Prelude hiding (any)
You have almost all of the pieces in place to use the data-reify library.
data AstF f =
LitF Int
| AddF f f
deriving (Eq, Show, Functor, Foldable, Traversable)
newtype Fix f = In { out :: f (Fix f) }
deriving instance Eq (f (Fix f)) => Eq (Fix f)
deriving instance Show (f (Fix f)) => Show (Fix f)
instance Traversable a => MuRef (Fix a) where
type DeRef (Fix a) = a
mapDeRef f = traverse f . out
All that's missing is the call to reifyGraph
. Let's try a small example
do
let example = In (AddF (In (AddF (In (LitF 1)) (In (LitF 2)))) example)
graph <- reifyGraph example
print graph
This outputs
let [(1,AddF 2 1),(2,AddF 3 4),(4,LitF 2),(3,LitF 1)] in 1
graph
has the type Graph AstF
, and is constructed by the constructor Graph [(Unique, AstF Unique)] Unique
. The first argument to the constructor is the list of nodes with their new unique keys. Each edge in the structure has been replaced with the new unique key of the node at the edge's head. The second argument to the constructor is the unique key of the node of the root of the tree.
We will convert the Graph
from data-reify into a de Bruijn indexed abstract syntax tree with Let
bindings. We will represent the AST using the following type. This type doesn't need to know anything about the internal representation of the AST.
type Index = Int
-- This can be rewritten in terms of Fix and Functor composition
data Indexed f
= Var Index
| Let (Indexed f) (Indexed f)
| Exp (f (Indexed f))
deriving instance Eq (f (Indexed f)) => Eq (Indexed f)
deriving instance Show (f (Indexed f)) => Show (Indexed f)
The Index
es represent the number of Let
s between where the variable is used and the Let
where it was declared. You should read Let a b
as let (Var 0)=a in b
Our strategy to convert the graph into an Indexed
AST is to traverse the graph starting at the root node. At every node, we will introduce a Let
binding for that node. For every edge we will check to see if the node it refers to is already in an introduced Let
binding that is in scope. If it is, we will replace the edge with the variable for that Let
binding. If it is not already introduced by a Let
binding, we will traverse it. The only thing we need to know about the AST we are operating on is that it is a Functor
.
index :: Functor f => Graph (DeRef (Fix f)) -> Indexed f
index (Graph edges root) = go [root]
where
go keys@(key:_) =
Let (Exp (fmap lookup (map ! key))) (Var 0)
where
lookup unique =
case List.elemIndex unique keys of
Just n -> Var n
Nothing -> go (unique:keys)
map = IntMap.fromList edges
We will define the following for convenience.
reifyLet :: Traversable f => Fix f -> IO (Indexed f)
reifyLet = fmap index . reifyGraph
We'll try the same example as before
do
let example = In (AddF (In (AddF (In (LitF 1)) (In (LitF 2)))) example)
lets <- reifyLet example
print lets
This outputs
Let (Exp (AddF (Let (Exp (AddF (Let (Exp (LitF 1)) (Var 0)) (Let (Exp (LitF 2)) (Var 0)))) (Var 0)) (Var 0))) (Var 0)
We only had 1 let
binding in example
but this has 4 Let
s. We will remove the unnecessary Let
binding in the next step.
To remove Let
bindings that introduce unused variables, we need a notion of what a used variable is. We will define it for any Foldable
AST.
used :: (Foldable f) => Index -> Indexed f -> Bool
used x (Var y) = x == y
used x (Let a b) = used (x+1) a || used (x+1) b
used x (Exp a) = any (used x) a
When we remove a Let
bindings, the number of intervening Let
bindings, and thus the de Bruijn indices for variables, will change. We will need to be able to remove a variable from an Indexed
AST
remove x :: (Functor f) => Index -> Indexed f -> Indexed f
remove x (Var y) =
case y `compare` x of
EQ -> error "Removed variable that's being used`
LT -> Var y
GT -> Var (y-1)
remove x (Let a b) = Let (remove (x+1) a) (remove (x+1) b)
remove x (Exp a) = Exp (fmap (remove x) a)
There are two ways a Let
binding can introduce an unused variable. The variable can be completely unused, for example let a = 1 in 2
, or it can be trivially used, as in let a = 1 in a
. The first can be replaced by 2
and the second can be replaced by 1
. When we remove the Let
binding, we also need to adjust all of the remaining variables in the AST with remove
. Things that aren't Let
don't introduce unused variables, and have nothing to replace.
removeUnusedLet :: (Functor f, Foldable f) => Indexed f -> Indexed f
removeUnusedLet (Let a b) =
if (used 0 b)
then
case b of
Var 0 ->
if (used 0 a)
then (Let a b)
else remove 0 a
_ -> (Let a b)
else remove 0 b
removeUnusedLet x = x
We'd like to be able to apply removeUnusedLet
everywhere in the Indexed
AST. We could use something more generic for this, but we'll just define for ourselves how to apply a function everywhere in an Indexed
AST
mapIndexed :: (Functor f) => (Indexed f -> Indexed f) -> Indexed f -> Indexed f
mapIndexed f (Let a b) = Let (f a) (f b)
mapIndexed f (Exp a) = Exp (fmap f a)
mapIndexed f x = x
postMap :: (Functor f) => (Indexed f -> Indexed f) -> Indexed f -> Indexed f
postMap f = go
where
go = f . mapIndexed go
Then we can remove all the unused lets with
removeUnusedLets = postMap removeUnusedLet
We'll try our example again
do
let example = In (AddF (In (AddF (In (LitF 1)) (In (LitF 2)))) example)
lets <- reifyLet example
let simplified = removeUnusedLets lets
print simplified
This introduces only a single Let
Let (Exp (AddF (Exp (AddF (Exp (LitF 1)) (Exp (LitF 2)))) (Var 0))) (Var 0)
Mutually recursive definitions don't result in mutually recursive Let
bindings. For example
do
let
left = In (AddF (In (LitF 1)) right )
right = In (AddF left (In (LitF 2)))
example = In (AddF left right )
lets <- reifyLet example
let simplified = removeUnusedLets lets
print simplified
Results in
Exp (AddF
(Let (Exp (AddF
(Exp (LitF 1))
(Exp (AddF (Var 0) (Exp (LitF 2))))
)) (Var 0))
(Let (Exp (AddF
(Exp (AddF (Exp (LitF 1)) (Var 0)))
(Exp (LitF 2))
)) (Var 0)))
I don't believe there is a mutually recursive representation for these in Indexed
without using a negative Index
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With