
Using Alex in Haskell to make a lexer that parses Dice Rolls

I'm making a parser for a DSL in Haskell using Alex + Happy. My DSL uses dice rolls as part of the possible expressions.

Sometimes I have an expression that I want to parse that looks like:

[some code...]  3D6  [... rest of the code]

Which should translate roughly to:

TokenInt {... value = 3}, TokenD, TokenInt {... value = 6}

My DSL also uses variables (basically, Strings), so I have a special token that handles variable names. So, with these tokens:

"D"                                 { \pos str -> TokenD pos }
$alpha [$alpha $digit \_ \']*       { \pos str -> TokenName pos str}
$digit+                             { \pos str -> TokenInt pos (read str) }

The result I'm getting from my lexer now is:

TokenInt {... value = 3}, TokenName { ... , name = "D6"}

Which means that my lexer "reads" an Integer and a Variable named "D6".

I have tried many things. For example, I changed the rule for the D token to:

$digit "D" $digit                   { \pos str -> TokenD pos }

But that just consumes the digits :(

  • Can I parse the dice roll with the numbers?
  • Or at least parse TokenInt-TokenD-TokenInt?

PS: I'm using PosN as a wrapper, not sure if relevant.

asked Jul 13 '20 04:07 by Zeb


1 Answer

The way I'd go about it would be to extend the TokenD constructor to TokenD Int Int, so that the token itself carries both numbers. Using the basic wrapper for convenience, I would write:

$digit+ D $digit+ { dice }
...
dice :: String -> Token
dice s = TokenD (read $ head ls) (read $ last ls)
  where ls = split 'D' s

split can be found here.
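
If you would rather not depend on an external split helper, the same action can be written with break from the Prelude. This is only a sketch: the Token type below is a stand-in with a single constructor (your real type has more), and it assumes the lexeme has already been matched by the $digit+ D $digit+ rule, so both sides of the D are non-empty digit strings.

```haskell
-- Stand-in token type for illustration only; the real lexer's type would
-- also contain TokenInt, TokenName, and so on.
data Token = TokenD Int Int
  deriving (Eq, Show)

-- Build a dice token from a lexeme such as "3D6".
-- Assumes the input has the shape <digits>D<digits>, which the Alex rule
-- guarantees; on any other input the calls to read would fail.
dice :: String -> Token
dice s = TokenD (read count) (read sides)
  where
    (count, rest) = break (== 'D') s  -- "3D6" splits into ("3", "D6")
    sides         = drop 1 rest       -- drop the 'D' itself, leaving "6"
```

With the posn wrapper, actions receive an AlexPosn as their first argument, so the action would need one extra parameter and the token would need a field for the position if you want to keep it.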

This is an extra step that would usually be done during syntactic analysis, but it doesn't hurt much here.

Also, I can't make Alex lex a character that matches $alpha as TokenD instead of TokenName. If we had Di instead of D that'd be no problem. From Alex's docs:

When the input stream matches more than one rule, the rule which matches the longest prefix of the input stream wins. If there are still several rules which match an equal number of characters, then the rule which appears earliest in the file wins.

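By that rule, for the input D6 the TokenName rule matches two characters while the "D" rule matches only one, so TokenName wins. That is consistent with the output in the question, so it does not look like a bug in Alex.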
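
To see what the quoted rule implies for the three rules in the question, here is a small simulation in plain Haskell. It is not how Alex is implemented; it only compares how many characters each rule would match at the start of the input. The function names are made up for this sketch, and it assumes $alpha and $digit behave like isAlpha and isDigit from Data.Char.

```haskell
import Data.Char (isAlpha, isDigit)

-- Characters allowed after the first letter of a name: [$alpha $digit \_ \']
isNameChar :: Char -> Bool
isNameChar c = isAlpha c || isDigit c || c == '_' || c == '\''

-- Each function returns how many leading characters its rule matches
-- (0 means the rule does not match at all).
matchD :: String -> Int
matchD ('D' : _) = 1
matchD _         = 0

matchName :: String -> Int
matchName (c : cs) | isAlpha c = 1 + length (takeWhile isNameChar cs)
matchName _ = 0

matchInt :: String -> Int
matchInt = length . takeWhile isDigit

-- The rules in the same order as in the question's lexer file.
rules :: [(String, String -> Int)]
rules = [("TokenD", matchD), ("TokenName", matchName), ("TokenInt", matchInt)]

-- Longest match wins; on a tie the earliest rule in the file is kept.
winner :: String -> Maybe (String, Int)
winner input =
  case [(name, n) | (name, match) <- rules, let n = match input, n > 0] of
    []       -> Nothing
    (c : cs) -> Just (foldl pick c cs)
  where
    pick best cand
      | snd cand > snd best = cand  -- strictly longer match replaces the best
      | otherwise           = best  -- ties keep the earlier rule
```

Under these assumptions, winner "D6" picks TokenName with a match of length 2, while winner "D" picks TokenD, because the tie at length 1 goes to the earlier rule. The combined rule $digit+ D $digit+ avoids the problem because, on 3D6, it matches three characters and is therefore longer than anything $digit+ alone can match.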

answered Nov 19 '22 04:11 by Mihalis