Assume we'd like to build a type that represents operations typical for, let's say, a lock-free algorithm:
newtype IntPtr = IntPtr { ptr :: Int } deriving (Eq, Ord, Show)
data Op r where
OpRead :: IntPtr -> Op Int
OpWrite :: IntPtr -> Int -> Op ()
OpCAS :: IntPtr -> Int -> Int -> Op Bool
Ideally, we'd like to represent some algorithms within this model using a convenient do
-notation, like (assuming corresponding read = OpRead
and cas = OpCAS
for aesthetic reasons) the following almost literal translation of the Wikipedia example:
import Prelude hiding (read)
import Control.Monad.Loops
add :: IntPtr -> Int -> Op Int
add p a = snd <$> do
iterateUntil fst $ do
value <- read p
success <- cas p value (value + a)
pure (success, value + a)
How could we achieve that? Let's add a couple more constructors to Op
to represent pure injected values and the monadic bind:
OpPure :: a -> Op a
OpBind :: Op a -> (a -> Op b) -> Op b
So let's try to write a Functor
instance. OpPure
and OpBind
is easy, being, for instance:
instance Functor Op where
fmap f (OpPure x) = OpPure (f x)
But the constructors that specify the GADT type start smelling bad:
fmap f (OpRead ptr) = do
val <- OpRead ptr
pure $ f val
Here we assume we'll write the Monad
instance later on anyway to avoid ugly nested OpBind
s.
Is this the right way to handle such types, or is my design just terribly wrong, this being a sign of it?
This style of using do
-notation to build a syntax tree that'll be interpreted later is modelled by the free monad. (I'm actually going to demonstrate what's known as the freer or operational monad, because it's closer to what you have so far.)
Your original Op
datatype - without OpPure
and OpBind
- represents a set of atomic typed instructions (namely read
, write
and cas
). In an imperative language a program is basically a list of instructions, so let's design a datatype which represents a list of Op
s.
One idea might be to use an actual list, ie type Program r = [Op r]
. Clearly that won't do as it constrains every instruction in the program to have the same return type, which would not make for a very useful programming language.
The key insight is that in any reasonable operational semantics of an interpreted imperative language, control flow doesn't proceed past an instruction until the interpreter has computed a return value for that instruction. That is, the nth instruction of a program depends in general on the results of instructions 0 to n-1. We can model this using continuation passing style.
data Program a where
Return :: a -> Program a
Step :: Op r -> (r -> Program a) -> Program a
A Program
is a kind of list of instructions: it's either an empty program which returns a single value, or it's a single instruction followed by a list of instructions. The function inside the Step
constructor means that the interpreter running the Program
has to come up with an r
value before it can resume interpreting the rest of the program. So sequentiality is ensured by the type.
To build your atomic programs read
, write
and cas
, you need to put them in a singleton list. This involves putting the relevant instruction in the Step
constructor, and passing a no-op continuation.
lift :: Op a -> Program a
lift i = Step i Return
read ptr = lift (OpRead ptr)
write ptr val = lift (OpWrite ptr val)
cas ptr cmp val = lift (OpCas ptr cmp val)
Program
differs from your tweaked Op
in that at each Step
there's only ever one instruction. OpBind
's left argument was potentially a whole tree of Op
s. This would've allowed you to distinguish differently-associated >>=
s, breaking the monad associativity law.
You can make Program
a monad.
instance Monad Program where
return = Return
Return x >>= f = f x
Step i k >>= f = Step i ((>>= f) . k)
>>=
basically performs list concatenation - it walks to the end of the list (by composing recursive calls to itself under the Step
continuations) and grafts on a new tail. This makes sense - it corresponds to the intutitive "run this program, then run that program" semantics of >>=
.
Noting that Program
's Monad
instance doesn't depend on Op
, an obvious generalisation is to parameterise the type of instruction and make Program
into a list of any old instruction set.
data Program i a where
Return :: a -> Program i a
Step :: i r -> (r -> Program i a) -> Program a
instance Monad (Program i) where
-- implementation is exactly the same
So Program i
is a monad for free, no matter what i
is. This version of Program
is a rather general tool for modelling imperative languages.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With