Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding Boost.spirit's string parser

#include <iostream>
#include <boost/spirit/include/qi.hpp>

namespace qi = boost::spirit::qi;
int main ()
{
    using qi::string;

    std::string input("a");
    std::string::iterator strbegin = input.begin();
    std::string p;
    bool ok = qi::phrase_parse(strbegin, input.end(),
            ((string("a")  >> string("a")) | string("a")),
            qi::space,                  
            p);                               

    if (ok && strbegin == input.end()) {
        std::cout << p << std::endl;
        std::cout << p.size() << std::endl;
    } else {
        std::cout << "fail" << std::endl;
        std::cout << std::string(strbegin, input.end()) << std::endl;
    }
}

This program outputs aa. How is it possible? Input string is a. Parser should match aa or a. I have written string("a") only for testing operators.

The same is when using char_ instead of string.

like image 589
Ashot Avatar asked Feb 22 '14 19:02

Ashot


1 Answers

It's not the string matcher per se. It's [attribute propagation] + [backtracking] in action.

A string attribute is a container attribute and many elements could be assigned into it by different parser subexpressions. Now for efficiency reasons, Spirit doesn't rollback the values of emitted attributes on backtracking.

Often this is no problem at all, but as you can see, the 'a' from the failed first branch of the alternative sticks around.

Either reword or employ the 'big gun' qi::hold[] directive:

(qi::hold [ string("a")  >> string("a") ] | string("a")),

Rewording could look like:

qi::string("a") >> -qi::string("a"),

Also, if you're really just trying to match certain textual strings, consider:

(qi::raw [ qi::lit("aa") | "a" ]), 
// or even just
qi::string("aa") | qi::string("a"),

Now which one of these applies most, depends on your grammar.

like image 87
sehe Avatar answered Nov 06 '22 05:11

sehe