
Parsing comma-separated list of ranges and numbers with semantic actions

Using Boost.Spirit X3, I want to parse a comma-separated list of ranges and individual numbers (e.g. 1-4, 6, 7, 9-12) into a single std::vector<int>. Here's what I've come up with:

namespace ast {
    struct range 
    {
        int first_, last_;    
    };    

    using expr = std::vector<int>;    
}

namespace parser {        
    template<typename T>
    auto as_rule = [](auto p) { return x3::rule<struct _, T>{} = x3::as_parser(p); };

    auto const push = [](auto& ctx) { 
        x3::_val(ctx).push_back(x3::_attr(ctx)); 
    };  

    auto const expand = [](auto& ctx) { 
        for (auto i = x3::_attr(ctx).first_; i <= x3::_attr(ctx).last_; ++i) 
            x3::_val(ctx).push_back(i);  
    }; 

    auto const number = x3::uint_;
    auto const range  = as_rule<ast::range> (number >> '-' >> number                   ); 
    auto const expr   = as_rule<ast::expr>  ( -(range [expand] | number [push] ) % ',' );
} 

Given the input

    "1,2,3,4,6,7,9,10,11,12",   // individually enumerated
    "1-4,6-7,9-12",             // short-hand: using three ranges

this is successfully parsed as (Live On Coliru):

OK! Parsed: 1, 2, 3, 4, 6, 7, 9, 10, 11, 12, 
OK! Parsed: 1, 2, 3, 4, 6, 7, 9, 10, 11, 12, 
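
For reference, the expansion the grammar is expected to perform can be sketched without Spirit at all. This is a hand-written stand-in, not the X3 approach from the question; `expand_ranges` is a hypothetical helper name, and the sketch assumes well-formed input (non-negative numbers, no empty items).

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the question's code): expands a
// comma-separated list of numbers and inclusive ranges into a flat vector.
// Assumes well-formed input; std::stoi throws on malformed items.
std::vector<int> expand_ranges(std::string const& input) {
    std::vector<int> out;
    std::istringstream items(input);
    std::string item;
    while (std::getline(items, item, ',')) {
        auto const dash = item.find('-');
        if (dash == std::string::npos) {
            // A single number: append it as-is.
            out.push_back(std::stoi(item));
        } else {
            // A range "first-last": append every value in [first, last].
            int const first = std::stoi(item.substr(0, dash));
            int const last  = std::stoi(item.substr(dash + 1));
            for (int i = first; i <= last; ++i)
                out.push_back(i);
        }
    }
    return out;
}
```

Both test inputs above should expand to the same ten numbers, which is what the parser output shows.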

Question: I think I understand that applying the semantic action expand to the range part is necessary, but why do I also have to apply the semantic action push to the number part? Without it (i.e. with a plain ( -(range [expand] | number) % ',' ) rule for expr), the individual numbers don't get propagated into the AST (Live On Coliru):

OK! Parsed: 
OK! Parsed: 1, 2, 3, 4, 6, 7, 9, 10, 11, 12, 

Bonus Question: do I even need semantic actions at all to do this? The Spirit X3 documentation seems to discourage them.

TemplateRex asked Oct 18 '22 18:10


1 Answer

The FAQ answer to this is that semantic actions suppress automatic attribute propagation, the assumption being that the semantic action will take care of it instead.

In general there are two approaches:

  • either use operator%= instead of operator= to assign the definition to the rule

  • or use the third (optional) template argument to the rule<> template, which can be specified as true to force automatic propagation semantics.


Simplified sample

Here, I simplify mostly by moving the semantic action inside the range rule itself, so that the expr rule no longer contains any semantic action. Now, we can drop the ast::range type altogether. No more fusion adaptation.

Instead we use the "naturally" synthesized attribute of number >> '-' >> number, which is a fusion sequence of two integers (fusion::deque<unsigned, unsigned> in this case, since number is x3::uint_).

Now, all that's left to make it work is to make sure the branches of | yield compatible types. The range rule yields a vector, so the number branch has to yield one as well. A simple repeat(1)[] fixes that.

Live On Coliru

#include <boost/spirit/home/x3.hpp>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

namespace x3 = boost::spirit::x3;

namespace ast {
    using expr = std::vector<int>;    

    struct printer {
        std::ostream& out;

        auto operator()(expr const& e) const {
            std::copy(std::begin(e), std::end(e), std::ostream_iterator<expr::value_type>(out, ", "));
        }
    };    
}

namespace parser {        
    auto const expand = [](auto& ctx) { 
        using boost::fusion::at_c;

        for (auto i = at_c<0>(x3::_attr(ctx)); i <= at_c<1>(x3::_attr(ctx)); ++i) 
            x3::_val(ctx).push_back(i);  
    }; 

    auto const number = x3::uint_;
    auto const range  = x3::rule<struct _r, ast::expr> {} = (number >> '-' >> number) [expand]; 
    auto const expr   = x3::rule<struct _e, ast::expr> {} = -(range | x3::repeat(1)[number]  ) % ',';
} 

template<class Phrase, class Grammar, class Skipper, class AST, class Printer>
auto test(Phrase const& phrase, Grammar const& grammar, Skipper const& skipper, AST& data, Printer const& print)
{
    auto first = phrase.begin();
    auto last = phrase.end();
    auto& out = print.out;

    auto const ok = phrase_parse(first, last, grammar, skipper, data);
    if (ok) {
        out << "OK! Parsed: "; print(data); out << "\n";
    } else {
        out << "Parse failed:\n";
        out << "\t on input: " << phrase << "\n";
    }
    if (first != last)
        out << "\t Remaining unparsed: '" << std::string(first, last) << "'\n";    
}

int main() {
    std::string numeric_tests[] =
    {
        "1,2,3,4,6,7,9,10,11,12",   // individually enumerated
        "1-4,6-7,9-12",             // short-hand: using three ranges
    };

    for (auto const& t : numeric_tests) {
        ast::expr numeric_data;
        test(t, parser::expr, x3::space, numeric_data, ast::printer{std::cout});
    }
}

Prints:

OK! Parsed: 1, 2, 3, 4, 6, 7, 9, 10, 11, 12, 
OK! Parsed: 1, 2, 3, 4, 6, 7, 9, 10, 11, 12, 

sehe answered Oct 21 '22 16:10