I'm writing an interpreter for a small language.
This language supports mutation, so its evaluator keeps track of a Store
for all the variables (where type Store = Map.Map Address Value
, type Address = Int
, and data Value
is a language-specific ADT).
It's also possible for computations to fail (e.g., dividing by zero), so the result has to be an Either String Value
.
The type of my interpreter, then, is
eval :: Environment -> Expression -> State Store (Either String Value)
where type Environment = Map.Map Identifier Address
keeps track of local bindings.
For example, interpreting a constant literal doesn't need to touch the store, and the result always succeeds, so
eval _ (LiteralExpression v) = return $ Right v
But when we apply a binary operator, we do need to consider the store.
For example, if the user evaluates (+ (x <- (+ x 1)) (x <- (+ x 1)))
and x
is initially 0
, then the final result should be 3
, and x
should be 2
in the resulting store.
This leads to the case
eval env (BinaryOperator op l r) = do
lval <- eval env l
rval <- eval env r
return $ join $ liftM2 (applyBinop op) lval rval
Note that the do
-notation is working within the State Store
monad.
Furthermore, the use of return
is monomorphic in State Store
, while the uses of join
and liftM2
are monomorphic in the Either String
monad.
That is, here we use
(return . join) :: Either String (Either String Value) -> State Store (Either String Value)
and return . join
is not a no-op.
(As is evident, applyBinop :: Identifier -> Value -> Value -> Either String Value
.)
This seems confusing at best, and this is a relatively simple case. The case of function application, for example, is considerably more complicated.
What useful best practices should I know about to keep my code readable—and writable?
EDIT: Here's a more typical example, which better showcases the ugliness.
The NewArrayC
variant has parameters length :: Expression
and element :: Expression
(it creates an array of a given length with all elements initialized to a constant).
A simple example is (newArray 3 "foo")
, which yields ["foo", "foo", "foo"]
, but we could also write (newArray (+ 1 2) (concat "fo" "oo"))
, because we can have arbitrary expressions in a NewArrayC
.
But when we actually call
allocateMany :: Int -> Value -> State Store Address,
which takes the number of elements to allocate and the value for each slot, and returns the starting address, we need to unpack those values.
In the logic below, you can see that I'm duplicating a bunch of logic that should be built-in to the Either
monad.
All the case
s should just be binds.
eval env (NewArrayC len el) = do
lenVal <- eval env len
elVal <- eval env el
case lenVal of
Right (NumV lenNum) -> case elVal of
Right val -> do
addr <- allocateMany lenNum val
return $ Right $ ArrayV addr lenNum -- result data type
left -> return left
Right _ -> return $ Left "expected number in new-array length"
left -> return left
This is what monad transformers are for. There is a StateT
transformer to add state to a stack, and an EitherT
transformer to add Either
-like failure to a stack; however, I prefer ExceptT
(which adds Except
-like failure), so I will give my discussion in terms of that. Since you want the stateful bit outermost, you should use ExceptT e (State s)
as your monad.
type DSL = ExceptT String (State Store)
Note that the stateful operations can be spelled get
and put
, and these are polymorphic over all instances of MonadState
; so that in particular they will work okay in our DSL
monad. Similarly, the canonical way to raise an error is throwError
, which is polymorphic over all instances of MonadError String
; and in particular will work okay in our DSL
monad.
So now we would write
eval :: Environment -> Expression -> DSL Value
eval _ (Literal v) = return v
eval e (Binary op l r) = liftM2 (applyBinop op) (eval e l) (eval e r)
You might also consider giving eval
a more polymorphic type; it could return an (MonadError String m, MonadState Store m) => m Value
instead of a DSL Value
. In fact, for allocateMany
, it's important that you give it a polymorphic type:
allocateMany :: MonadState Store m => Int -> Value -> m Address
There's two pieces of interest about this type: first, because it is polymorphic over all MonadState Store m
instances, you can be just as sure that it only has stateful side effects as if it had the type Int -> Value -> State Store Address
that you suggested. However, also because it is polymorphic, it can be specialized to return a DSL Address
, so it can be used in (for example) eval
. Your example eval
code becomes this:
eval env (NewArrayC len el) = do
lenVal <- eval env len
elVal <- eval env el
case lenVal of
NumV lenNum -> allocateMany lenNum elVal
_ -> throwError "expected number in new-array length"
I think that's quite readable, really; nothing too extraneous there.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With