Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Modeling a domain as a GADT type and providing do-sugar for it

Tags:

haskell

gadt

Assume we'd like to build a type that represents operations typical for, let's say, a lock-free algorithm:

newtype IntPtr = IntPtr { ptr :: Int } deriving (Eq, Ord, Show)

data Op r where 
  OpRead :: IntPtr -> Op Int
  OpWrite :: IntPtr -> Int -> Op ()

  OpCAS :: IntPtr -> Int -> Int -> Op Bool

Ideally, we'd like to represent some algorithms within this model using a convenient do-notation, like (assuming corresponding read = OpRead and cas = OpCAS for aesthetic reasons) the following almost literal translation of the Wikipedia example:

import Prelude hiding (read)
import Control.Monad.Loops

add :: IntPtr -> Int -> Op Int
add p a = snd <$> do
  iterateUntil fst $ do
    value <- read p
    success <- cas p value (value + a)
    pure (success, value + a)

How could we achieve that? Let's add a couple more constructors to Op to represent pure injected values and the monadic bind:

  OpPure :: a -> Op a
  OpBind :: Op a -> (a -> Op b) -> Op b

So let's try to write a Functor instance. OpPure and OpBind is easy, being, for instance:

instance Functor Op where
  fmap f (OpPure x) = OpPure (f x)

But the constructors that specify the GADT type start smelling bad:

  fmap f (OpRead ptr) = do
    val <- OpRead ptr
    pure $ f val

Here we assume we'll write the Monad instance later on anyway to avoid ugly nested OpBinds.

Is this the right way to handle such types, or is my design just terribly wrong, this being a sign of it?

like image 252
0xd34df00d Avatar asked Dec 24 '22 07:12

0xd34df00d


1 Answers

This style of using do-notation to build a syntax tree that'll be interpreted later is modelled by the free monad. (I'm actually going to demonstrate what's known as the freer or operational monad, because it's closer to what you have so far.)

Your original Op datatype - without OpPure and OpBind - represents a set of atomic typed instructions (namely read, write and cas). In an imperative language a program is basically a list of instructions, so let's design a datatype which represents a list of Ops.

One idea might be to use an actual list, ie type Program r = [Op r]. Clearly that won't do as it constrains every instruction in the program to have the same return type, which would not make for a very useful programming language.

The key insight is that in any reasonable operational semantics of an interpreted imperative language, control flow doesn't proceed past an instruction until the interpreter has computed a return value for that instruction. That is, the nth instruction of a program depends in general on the results of instructions 0 to n-1. We can model this using continuation passing style.

data Program a where
    Return :: a -> Program a
    Step :: Op r -> (r -> Program a) -> Program a

A Program is a kind of list of instructions: it's either an empty program which returns a single value, or it's a single instruction followed by a list of instructions. The function inside the Step constructor means that the interpreter running the Program has to come up with an r value before it can resume interpreting the rest of the program. So sequentiality is ensured by the type.

To build your atomic programs read, write and cas, you need to put them in a singleton list. This involves putting the relevant instruction in the Step constructor, and passing a no-op continuation.

lift :: Op a -> Program a
lift i = Step i Return

read ptr = lift (OpRead ptr)
write ptr val = lift (OpWrite ptr val)
cas ptr cmp val = lift (OpCas ptr cmp val)

Program differs from your tweaked Op in that at each Step there's only ever one instruction. OpBind's left argument was potentially a whole tree of Ops. This would've allowed you to distinguish differently-associated >>=s, breaking the monad associativity law.

You can make Program a monad.

instance Monad Program where
    return = Return
    Return x >>= f = f x
    Step i k >>= f = Step i ((>>= f) . k)

>>= basically performs list concatenation - it walks to the end of the list (by composing recursive calls to itself under the Step continuations) and grafts on a new tail. This makes sense - it corresponds to the intutitive "run this program, then run that program" semantics of >>=.


Noting that Program's Monad instance doesn't depend on Op, an obvious generalisation is to parameterise the type of instruction and make Program into a list of any old instruction set.

data Program i a where
    Return :: a -> Program i a
    Step :: i r -> (r -> Program i a) -> Program a

instance Monad (Program i) where
    -- implementation is exactly the same

So Program i is a monad for free, no matter what i is. This version of Program is a rather general tool for modelling imperative languages.

like image 182
Benjamin Hodgson Avatar answered Jan 12 '23 01:01

Benjamin Hodgson