
Tracking Position when Scanning Tokens complicates Parser

I am writing a two-pass parser: I first scan the text into tokens (using Alex), then parse those tokens (using Parsec). All was well and good until I tried to add position information to the tokens so that I can write good error messages.

Originally I had:

data Token = TAtom | TString String | TInt Integer | TFloat [...]

It seems like I can either add a Position element to each Token constructor or create a new type like data TokenWithPosition = T Token Position.

I have started down the latter path, but now I either have to create a TokenWithPosition with a fake position whenever I want to describe a token in Parsec, or unwrap the TokenWithPosition every time I want to make a comparison. In short, my nice clean grammar is being overrun with code needed to ignore the position information.

So my question: Is there a clean way to track position information without having it complicate the parser in the second pass? This seems like something that would have a standard solution.

John F. Miller asked Apr 19 '13 02:04



1 Answer

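You need to use functions from Text.Parsec.Prim (for instance, tokenPrim) to implement your own "primitive parsers". tokenPrim takes three functions: one to show a token in error messages, one to compute the next source position, and one to decide whether a token matches.

Those primitive parsers update Parsec's internal state with the position information and return a pure Token without the position, so the rest of the grammar never has to unwrap a TokenWithPosition or invent a fake position.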
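
A minimal sketch of that approach follows. It assumes the lexer already attaches a Parsec SourcePos to every token. The names Positioned, satisfyTok, tok, int and atomWithInts are hypothetical and chosen for illustration, and the Token type is trimmed to three of the question's constructors.

```haskell
import Text.Parsec (Parsec, SourcePos, parse, many1, eof)
import Text.Parsec.Pos (newPos)
import Text.Parsec.Prim (tokenPrim)

-- Token type from the question, trimmed to three constructors for this sketch.
data Token = TAtom | TString String | TInt Integer
  deriving (Eq, Show)

-- Hypothetical wrapper produced by the lexer: a token plus where it starts.
data Positioned = Positioned Token SourcePos
  deriving (Show)

type Parser = Parsec [Positioned] ()

-- The only function that looks at positions. Everything built on top of it
-- works with plain Token values.
satisfyTok :: (Token -> Maybe a) -> Parser a
satisfyTok test = tokenPrim showTok nextPos testTok
  where
    showTok (Positioned t _) = show t
    testTok (Positioned t _) = test t
    -- After consuming a token, move to the position of the following token;
    -- at the end of the stream, stay at the last token's position.
    nextPos _ _ (Positioned _ p : _) = p
    nextPos _ (Positioned _ p) []    = p

-- Match one specific token.
tok :: Token -> Parser Token
tok expected = satisfyTok (\t -> if t == expected then Just t else Nothing)

-- Match any integer token and return its value.
int :: Parser Integer
int = satisfyTok (\t -> case t of
                          TInt n -> Just n
                          _      -> Nothing)

-- Example grammar: an atom followed by one or more integers.
atomWithInts :: Parser [Integer]
atomWithInts = tok TAtom *> many1 int <* eof

main :: IO ()
main = do
  let at l c = newPos "input" l c
      good = [Positioned TAtom (at 1 1), Positioned (TInt 7) (at 1 6)]
      bad  = [Positioned TAtom (at 1 1), Positioned (TString "x") (at 2 3)]
  print (parse atomWithInts "input" good)
  print (parse atomWithInts "input" bad)
```

The grammar (atomWithInts) mentions only plain tokens; positions are handled once, inside satisfyTok. One caveat of this sketch: Parsec starts at line 1, column 1, so an error on the very first token reports that initial position rather than the token's own. Calling setPosition before the grammar runs is one way around that.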

Roman Cheplyaka answered Nov 07 '22 06:11
