Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Writing an assembler in Haskell - mapM with state?

I'm writing a very simple two-pass assembler in Haskell and I've come across a scenario that I don't yet have the experience to solve. I think the solution is likely to involve monad transformers, which I don't really understand.

The assembler parses the assembly code into a list of Statements, which are either instructions or labels. Some Statements may refer to labels. The assembler needs to convert the Statements into Instructions, which involves eliminating the labels and substituting the label references with an appropriate value.

I have written the first pass of the assembler, which produces a [(String, Int)] representing a map from labels to addresses. I have also written the following function for translating a Statement into an Instruction:

stmtToInstruction :: Int -> [(String, Int)] -> Statement -> Either String [I.Instruction]
stmtToInstruction addr labels stmt = case stmt of
    ADD d s1 s2     -> Right [I.ADD d s1 s2]
    BEQL s1 s2 l    -> case do label <- find (\e -> fst e == l) labels
                               let labelAddr = snd label
                               let relativeAddr = I.ImmS $ fromIntegral (labelAddr - addr)
                               return (I.BEQ s1 s2 relativeAddr) of
                        Just i -> Right [i]
                        Nothing -> Left $ "Label " ++ l ++ " not defined"
    LABEL _         -> Right []

I've omitted several cases for brevity, but you can see all the possible results here:

  • ADD always succeeds and produces an instruction
  • BEQL can either succeed or fail, depending on whether a label is found
  • LABEL always succeeds, even though it produces no actual instructions

This works as expected. The problem I now have is writing this function:

replaceLabels :: [Statement] -> Either String [I.Instruction]

replaceLabels takes a list of statements, and runs stmtToInstruction on each one. The addr argument to stmtToInstruction must be the length of the [Instruction] accumulated so far. The output may either be a Left String, if one of the label references was invalid, or a Right [I.Instruction], if there were no errors.

mapM :: Monad m => (a -> m b) -> [a] -> m [b] gets us some of the way there, but provides no way to inject the current address into the (a -> m b) function. How do I make this work?

like image 905
rossng Avatar asked Nov 18 '17 23:11

rossng


2 Answers

You're right: the StateT monad transformer will do the trick:

imapM :: (Traversable t, Monad m)
          => (Int -> a -> m b) -> t a -> m (t b)
imapM f = flip runStateT 0 .
  mapM (\a ->
    do
      count <- get
      put $! count + 1
      f count a)

But writing the specialized version for lists might be better:

itraverse :: Applicative f
          => (Int -> a -> f b) -> [a] -> f [b]
itraverse f = go 0 where
  go !_ [] = pure []
  go !count (x:xs) = (:) <$> f count x <*> go (count + 1) xs
like image 174
dfeuer Avatar answered Sep 18 '22 07:09

dfeuer


I've implemented a recursive solution that I'm sure is very inefficient. I'd still be interested to see the 'proper' way of doing this.

replaceLabels :: [Statement] -> Either String [I.Instruction]
replaceLabels [] = Right []
replaceLabels stmts@(s:ss) = replaceLabels' labels stmts 0
    where labels = process stmts

replaceLabels' :: [(String, Int)] -> [Statement] -> Int -> Either String [I.Instruction]
replaceLabels' _ [] _ = Right []
replaceLabels' labels (s:ss) addr = do
                                instructions <- stmtToInstruction addr labels s
                                restInstructions <- replaceLabels' labels ss (addr + length instructions)
                                return (instructions ++ restInstructions)
like image 32
rossng Avatar answered Sep 20 '22 07:09

rossng