
Boost.Spirit: Lex + Qi error reporting

I am writing a parser for fairly complicated config files that rely on indentation, among other things. I decided to use Lex to break the input into tokens, as it seems to make life easier. The problem is that I cannot find any examples of using the Qi error reporting tools (on_error) with parsers that operate on a stream of tokens instead of characters.

The error handler used in on_error takes some iterators so that it can indicate exactly where the error is in the input stream. All the examples just construct a std::string from the pair of iterators and print it. But if Lex is used, those iterators point into the sequence of tokens, not characters. In my program this led to a hang in the std::string constructor before I noticed the invalid iterator type.

As I understand it, a token can hold a pair of iterators into the input stream as its value. This is the default attribute type (if the token type is something like lex::lexertl::token<>). But if I want my token to contain something more useful for parsing (int, std::string, etc.), those iterators are lost.

How can I produce human-friendly error messages that indicate the position in the input stream while using Lex with Qi? Are there any examples of such usage?

Thanks.

Paul Graphov, asked May 11 '11 12:05


1 Answer

Sorry for the late reply, but it took me some time to prepare a decent example of what you're trying to achieve. I have now added a new lexer example to Spirit: conjure_lexer. It is a modified version of the conjure (Qi) example, which implements a small programming language. The main difference is that it uses a lexer instead of a pure Qi grammar.

The new conjure_lexer example demonstrates several things: a) it uses a new position_token class, which extends the existing token type and always stores the pair of iterators pointing to the corresponding matched input sequence (in addition to the usual information such as token id, token value, etc.); b) it uses this positional information for error reporting; and c) along the way, it demonstrates how using a lexer can simplify the grammar.

The new example is in SVN (trunk) and will be available in Boost V1.47 (to be released soon). It's in this directory: $BOOST_ROOT/libs/spirit/example/qi/compiler-tutorial/conjure_lexer.
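To give a rough idea of point b) without pulling in Spirit, here is a minimal standard-library sketch of how a token that keeps its pair of source iterators can be turned into a line/column message. It is not the code from conjure_lexer and does not use the Boost API: PosToken, SourcePos, locate and format_error are hypothetical names made up for illustration, and the sketch assumes the input is held in a std::string and contains no tabs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Iterator into the original character input (an assumption of this sketch).
using Iter = std::string::const_iterator;

// Hypothetical stand-in for a position-carrying token: besides its id and
// converted value it keeps the iterator pair delimiting the matched input.
struct PosToken {
    int id;             // token id (illustrative)
    std::string value;  // converted attribute, e.g. an identifier's text
    Iter first;         // start of the matched input
    Iter last;          // one past the end of the matched input
};

struct SourcePos {
    std::size_t line;    // 1-based
    std::size_t column;  // 1-based
};

// Compute the line and column of `pos` by scanning from the start of the input.
SourcePos locate(Iter begin, Iter pos) {
    SourcePos p{1, 1};
    for (Iter it = begin; it != pos; ++it) {
        if (*it == '\n') {
            ++p.line;
            p.column = 1;
        } else {
            ++p.column;
        }
    }
    return p;
}

// Build a message with the position, the offending source line and a caret
// under the first character of the token.
std::string format_error(Iter begin, Iter end, const PosToken& tok,
                         const std::string& what) {
    assert(begin <= tok.first && tok.first <= end);  // token must lie in the input
    const SourcePos p = locate(begin, tok.first);

    // Walk back to the start of the line and forward to its end.
    Iter line_start = tok.first;
    while (line_start != begin && *(line_start - 1) != '\n') {
        --line_start;
    }
    const Iter line_end = std::find(tok.first, end, '\n');

    std::ostringstream out;
    out << "line " << p.line << ", column " << p.column
        << ": expected " << what << '\n'
        << std::string(line_start, line_end) << '\n'
        << std::string(p.column - 1, ' ') << '^';
    return out.str();
}
```

An error handler would call format_error with the begin and end of the original input and the token at which parsing failed. The important point is that the message is built from the character iterators stored inside the token, not from the token-stream iterators that Qi passes to the handler.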

hkaiser, answered Nov 01 '22 14:11