How to skip line/block/nested-block comments in Boost.Spirit?

When parsing a language using Boost.Spirit, how can I ensure that I skip

// line comments

/* block
   comments */ and

/* /* nested
   block */ comments */

when reading in the code? At the moment, I just do a phrase_parse into a predefined qi::grammar. I guess what I need is some sort of skipping lexer, right?

Dmitri Nesteruk asked Mar 08 '23 20:03


1 Answer

No lexers required.

A sample grammar that implements this appears in the answer to "Cross-platform way to get line number of an INI file where given option was found". Regardless, you can use a skipper like this:

using Skipper = qi::rule<Iterator>;

Skipper block_comment, single_line_comment, skipper;

single_line_comment = "//" >> *(char_ - eol) >> (eol|eoi);
block_comment = "/*" >> *(block_comment | char_ - "*/") > "*/";

skipper = single_line_comment | block_comment;

Of course, if whitespace should also be skipped, use:

skipper = space | single_line_comment | block_comment;

This supports nested block comments and throws qi::expectation_failure<> if a closing */ is missing.

Note that a /* inside a single-line comment deliberately does not open a block comment: everything up to the end of that line is skipped as part of the line comment.

Demo

Live On Coliru

#include <boost/spirit/include/qi.hpp>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;

int main() {
    using Iterator = boost::spirit::istream_iterator;
    using Skipper  = qi::rule<Iterator>;

    Skipper block_comment, single_line_comment, skipper;

    {
        using namespace qi;
        single_line_comment = "//" >> *(char_ - eol) >> (eol|eoi);
        block_comment       = ("/*" >> *(block_comment | char_ - "*/")) > "*/";

        skipper             = space | single_line_comment | block_comment;
    }

    Iterator f(std::cin >> std::noskipws), l;

    std::vector<int> data;
    bool ok = phrase_parse(f, l, *qi::int_, skipper, data);
    if (ok) {
        std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout << "Parsed ", " "));
        std::cout << "\n";
    } else {
        std::cout << "Parse failed\n";
    }

    if (f!=l) {
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}

Which prints:

Parsed 123 456 567 901 

Given the input

123 // line comments 234

/* block 345
   comments */ 456

567

/* 678 /* nested
   789 block */ comments 890 */

901
sehe answered Mar 24 '23 22:03