I'm writing a very simple two-pass assembler in Haskell and I've come across a scenario that I don't yet have the experience to solve. I think the solution is likely to involve monad transformers, which I don't really understand.
The assembler parses the assembly code into a list of Statement
s, which are either instructions or labels. Some Statement
s may refer to labels. The assembler needs to convert the Statement
s into Instruction
s, which involves eliminating the labels and substituting the label references with an appropriate value.
I have written the first pass of the assembler, which produces a [(String, Int)]
representing a map from labels to addresses. I have also written the following function for translating a Statement
into an Instruction
:
stmtToInstruction :: Int -> [(String, Int)] -> Statement -> Either String [I.Instruction]
stmtToInstruction addr labels stmt = case stmt of
ADD d s1 s2 -> Right [I.ADD d s1 s2]
BEQL s1 s2 l -> case do label <- find (\e -> fst e == l) labels
let labelAddr = snd label
let relativeAddr = I.ImmS $ fromIntegral (labelAddr - addr)
return (I.BEQ s1 s2 relativeAddr) of
Just i -> Right [i]
Nothing -> Left $ "Label " ++ l ++ " not defined"
LABEL _ -> Right []
I've omitted several cases for brevity, but you can see all the possible results here:
ADD
always succeeds and produces an instructionBEQL
can either succeed or fail, depending on whether a label is foundLABEL
always succeeds, even though it produces no actual instructionsThis works as expected. The problem I now have is writing this function:
replaceLabels :: [Statement] -> Either String [I.Instruction]
replaceLabels
takes a list of statements, and runs stmtToInstruction
on each one. The addr
argument to stmtToInstruction
must be the length of the [Instruction]
accumulated so far. The output may either be a Left String
, if one of the label references was invalid, or a Right [I.Instruction]
, if there were no errors.
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
gets us some of the way there, but provides no way to inject the current address into the (a -> m b)
function. How do I make this work?
You're right: the StateT
monad transformer will do the trick:
imapM :: (Traversable t, Monad m)
=> (Int -> a -> m b) -> t a -> m (t b)
imapM f = flip runStateT 0 .
mapM (\a ->
do
count <- get
put $! count + 1
f count a)
But writing the specialized version for lists might be better:
itraverse :: Applicative f
=> (Int -> a -> f b) -> [a] -> f [b]
itraverse f = go 0 where
go !_ [] = pure []
go !count (x:xs) = (:) <$> f count x <*> go (count + 1) xs
I've implemented a recursive solution that I'm sure is very inefficient. I'd still be interested to see the 'proper' way of doing this.
replaceLabels :: [Statement] -> Either String [I.Instruction]
replaceLabels [] = Right []
replaceLabels stmts@(s:ss) = replaceLabels' labels stmts 0
where labels = process stmts
replaceLabels' :: [(String, Int)] -> [Statement] -> Int -> Either String [I.Instruction]
replaceLabels' _ [] _ = Right []
replaceLabels' labels (s:ss) addr = do
instructions <- stmtToInstruction addr labels s
restInstructions <- replaceLabels' labels ss (addr + length instructions)
return (instructions ++ restInstructions)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With