Mixing the lexing and parsing phases into one sometimes not only makes Parsec parsers less readable but also slows them down. One solution is to use Alex as a tokenizer and then Parsec as a parser of the token stream.
This is fine, but it would be even better if I could get rid of Alex, because it adds a preprocessing phase to the compilation pipeline, doesn't integrate well with Haskell "IDEs", etc. I was wondering if there is such a thing as a Haskell EDSL for describing tokenizers, very much in the style of Alex, but as a library.
Yes - http://www.cse.unsw.edu.au/~chak/papers/Cha99.html
Before Hackage, Manuel used to release the code in a package called CTK (Compiler Toolkit). I'm not sure what the status of the project is these days.
I think Thomas Hallgren's lexer from the paper "Lexing Haskell in Haskell" was dynamic rather than a code generator. Although the release is tailored to lexing Haskell, the machinery in the library is more general. Iavor Diatchki has put the code on Hackage:
http://hackage.haskell.org/package/haskell-lexer
You can use Parsec as the lexer too. First you parse the string into tokens, then you parse the tokens into the target data type.
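A minimal sketch of that two-stage setup, assuming a toy arithmetic language. The `Token` and `Expr` types and the helper names (`lexer`, `satisfyTok`, `tok`, `parseExpr`) are made up for illustration; only the Parsec functions themselves (`tokenPrim`, `getPosition`, `chainl1`, and so on) come from the library. Each token carries the source position where it was lexed, so errors reported by the second stage still point into the original string.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical token and AST types for a toy expression language.
data Token = TInt Integer | TPlus | TStar | TLParen | TRParen
  deriving (Eq, Show)

data Expr = Lit Integer | Add Expr Expr | Mul Expr Expr
  deriving (Eq, Show)

-- A token paired with the position at which it starts.
type PosToken = (SourcePos, Token)

-- Stage 1: a Parsec parser over characters that produces tokens.
lexer :: Parser [PosToken]
lexer = spaces *> many (positioned <* spaces) <* eof
  where
    positioned = (,) <$> getPosition <*> oneToken
    oneToken = choice
      [ TInt . read <$> many1 digit
      , TPlus   <$ char '+'
      , TStar   <$ char '*'
      , TLParen <$ char '('
      , TRParen <$ char ')'
      ]

-- Stage 2: a Parsec parser whose input stream is the token list.
type TokParser = Parsec [PosToken] ()

-- Accept one token if the function returns Just; the position is
-- advanced to that of the next token in the stream, if any.
satisfyTok :: (Token -> Maybe a) -> TokParser a
satisfyTok f = tokenPrim (show . snd) nextPos (f . snd)
  where
    nextPos _ _ ((p, _) : _) = p
    nextPos p _ []           = p

-- Match one specific token.
tok :: Token -> TokParser ()
tok t = satisfyTok (\t' -> if t == t' then Just () else Nothing)

int :: TokParser Integer
int = satisfyTok $ \t -> case t of
  TInt n -> Just n
  _      -> Nothing

-- The grammar: '*' binds tighter than '+', both left-associative.
expr, term, factor :: TokParser Expr
expr   = term   `chainl1` (Add <$ tok TPlus)
term   = factor `chainl1` (Mul <$ tok TStar)
factor = Lit <$> int <|> between (tok TLParen) (tok TRParen) expr

-- Run the lexer, then feed its tokens to the parser.
parseExpr :: String -> Either ParseError Expr
parseExpr src = parse lexer "<input>" src >>= parse (expr <* eof) "<input>"
```

With these definitions, `parseExpr "1 + 2 * (3 + 4)"` should yield `Right (Add (Lit 1) (Mul (Lit 2) (Add (Lit 3) (Lit 4))))`. Note that the grammar in stage 2 never mentions whitespace; that is the readability gain of keeping the lexer separate.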