I'm trying to use CPS to simplify control-flow implementation in my Python interpreter. Specifically, when implementing return
/break
/continue
, I have to store state and unwind manually, which is tedious. I've read that it's extraordinarily tricky to implement exception handling in this way. What I want is for each eval
function to be able to direct control flow to either the next instruction, or to a different instruction entirely.
Some people with more experience than me suggested looking into CPS as a way to deal with this properly. I really like how it simplifies control flow in the interpreter, but I'm not sure how much I need to actually do in order to accomplish this.
Do I need to run a CPS transform on the AST? Should I lower this AST into a lower-level IR that is smaller and then transform that?
Do I need to update the evaluator to accept the success continuation everywhere? (I'm assuming so).
I think I generally understand the CPS transform: the goal is to thread the continuation through the entire AST, including all expressions.
I'm also a bit confused where the Cont
monad fits in here, as the host language is Haskell.
Edit: here's a condensed version of the AST in question. It is a 1-1 mapping of Python statements, expressions, and built-in values.
data Statement
= Assignment Expression Expression
| Expression Expression
| Break
| While Expression [Statement]
data Expression
| Attribute Expression String
| Constant Value
data Value
= String String
| Int Integer
| None
To evaluate statements, I use eval
:
eval (Assignment (Variable var) expr) = do
value <- evalExpr expr
updateSymbol var value
eval (Expression e) = do
_ <- evalExpr e
return ()
To evaluate expressions, I use evalExpr
:
evalExpr (Attribute target name) = do
receiver <- evalExpr target
attribute <- getAttr name receiver
case attribute of
Just v -> return v
Nothing -> fail $ "No attribute " ++ name
evalExpr (Constant c) = return c
What motivated the whole thing was the shenanigans required for implementing break. The break definition is reasonable, but what it does to the while definition is a bit much:
eval (Break) = do
env <- get
when (loopLevel env <= 0) (fail "Can only break in a loop!")
put env { flow = Breaking }
eval (While condition block) = do
setup
loop
cleanup
where
setup = do
env <- get
let level = loopLevel env
put env { loopLevel = level + 1 }
loop = do
env <- get
result <- evalExpr condition
when (isTruthy result && flow env == Next) $ do
evalBlock block
-- Pretty ugly! Eat continue.
updatedEnv <- get
when (flow updatedEnv == Continuing) $ put updatedEnv { flow = Next }
loop
cleanup = do
env <- get
let level = loopLevel env
put env { loopLevel = level - 1 }
case flow env of
Breaking -> put env { flow = Next }
Continuing -> put env { flow = Next }
_ -> return ()
I am sure there are more simplifications that can be done here, but the core problem is one of stuffing state somewhere and manually winding out. I'm hoping that CPS will let me stuff book-keeping (like loop exit points) into state and just use those when I need them.
I dislike the split between statements and expressions and worry it might make the CPS transform more work.
This finally gave me a good excuse to try using ContT
!
Here's one possible way of doing this: store (in a Reader
wrapped in ContT
) the continuation of exiting the current (innermost) loop:
newtype M r a = M{ unM :: ContT r (ReaderT (M r ()) (StateT (Map Id Value) IO)) a }
deriving ( Functor, Applicative, Monad
, MonadReader (M r ()), MonadCont, MonadState (Map Id Value)
, MonadIO
)
runM :: M a a -> IO a
runM m = evalStateT (runReaderT (runContT (unM m) return) (error "not in a loop")) M.empty
withBreakHere :: M r () -> M r ()
withBreakHere act = callCC $ \break -> local (const $ break ()) act
break :: M r ()
break = join ask
(I've also added IO
for easy printing in my toy interpreter, and State (Map Id Value)
for variables).
Using this setup, you can write Break
and While
as:
eval Break = break
eval (While condition block) = withBreakHere $ fix $ \loop -> do
result <- evalExpr condition
unless (isTruthy result)
break
evalBlock block
loop
Here's the full code for reference:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Interp where
import Prelude hiding (break)
import Control.Applicative
import Control.Monad.Cont
import Control.Monad.State
import Control.Monad.Reader
import Data.Function
import Data.Map (Map)
import qualified Data.Map as M
import Data.Maybe
type Id = String
data Statement
= Print Expression
| Assign Id Expression
| Break
| While Expression [Statement]
| If Expression [Statement]
deriving Show
data Expression
= Var Id
| Constant Value
| Add Expression Expression
| Not Expression
deriving Show
data Value
= String String
| Int Integer
| None
deriving Show
data Env = Env{ loopLevel :: Int
, flow :: Flow
}
data Flow
= Breaking
| Continuing
| Next
deriving Eq
newtype M r a = M{ unM :: ContT r (ReaderT (M r ()) (StateT (Map Id Value) IO)) a }
deriving ( Functor, Applicative, Monad
, MonadReader (M r ()), MonadCont, MonadState (Map Id Value)
, MonadIO
)
runM :: M a a -> IO a
runM m = evalStateT (runReaderT (runContT (unM m) return) (error "not in a loop")) M.empty
withBreakHere :: M r () -> M r ()
withBreakHere act = callCC $ \break -> local (const $ break ()) act
break :: M r ()
break = join ask
evalExpr :: Expression -> M r Value
evalExpr (Constant val) = return val
evalExpr (Var v) = gets $ fromMaybe err . M.lookup v
where
err = error $ unwords ["Variable not in scope:", show v]
evalExpr (Add e1 e2) = do
Int val1 <- evalExpr e1
Int val2 <- evalExpr e2
return $ Int $ val1 + val2
evalExpr (Not e) = do
val <- evalExpr e
return $ if isTruthy val then None else Int 1
isTruthy (String s) = not $ null s
isTruthy (Int n) = n /= 0
isTruthy None = False
evalBlock = mapM_ eval
eval :: Statement -> M r ()
eval (Assign v e) = do
val <- evalExpr e
modify $ M.insert v val
eval (Print e) = do
val <- evalExpr e
liftIO $ print val
eval (If cond block) = do
val <- evalExpr cond
when (isTruthy val) $
evalBlock block
eval Break = break
eval (While condition block) = withBreakHere $ fix $ \loop -> do
result <- evalExpr condition
unless (isTruthy result)
break
evalBlock block
loop
and here's a neat test example:
prog = [ Assign "i" $ Constant $ Int 10
, While (Var "i") [ Print (Var "i")
, Assign "i" (Add (Var "i") (Constant $ Int (-1)))
, Assign "j" $ Constant $ Int 10
, While (Var "j") [ Print (Var "j")
, Assign "j" (Add (Var "j") (Constant $ Int (-1)))
, If (Not (Add (Var "j") (Constant $ Int (-4)))) [ Break ]
]
]
, Print $ Constant $ String "Done"
]
which is
i = 10
while i:
print i
i = i - 1
j = 10
while j:
print j
j = j - 1
if j == 4:
break
so it will print
10 10 9 8 7 6 5
9 10 9 8 7 6 5
8 10 9 8 7 6 5
...
1 10 9 8 7 6 5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With